Modeling and solving a multimodal routing problem with
timetables and time windows
Luigi Moccia∗†, Jean-François Cordeau‡, Gilbert Laporte§,
Stefan Ropke§, Maria Pia Valentini¶
January 29, 2008
Abstract
This paper studies a routing problem in a multimodal network with shipment consolidation options. A freight forwarder can use a mix of flexible-time and scheduled transportation services. Time windows are a prominent aspect of the problem. For instance,
they are used to model opening hours of the terminals, as well as pickup and delivery time
slots. The various features of the problem can be described as elements of a digraph and
their integration leads to a holistic graph representation. This allows an origin-destination
integer multi-commodity flow formulation with non-convex piecewise linear costs, time
windows, and side constraints. Column generation algorithms are designed to compute
lower bounds. These column generation algorithms are also embedded within heuristics
aimed at finding feasible integer solutions. Computational results with real-life data are
presented and show the efficacy of the proposed approach.
Keywords: multicommodity flow problem, time windows, transportation timetable, multimodal transportation, column generation, non-convex piecewise linear cost function.
∗ Dipartimento di Elettronica, Informatica e Sistemistica, Università della Calabria, 87036 Rende (CS) - Italy [email protected]
† ENEA - Ente per le Nuove Tecnologie, l’Energia e l’Ambiente - C.R. Trisaia, s.s. 106 Jonica km 419+500, 75026 Rotondella (MT) - Italy
‡ Canada Research Chair in Logistics and Transportation, HEC Montréal, 3000 chemin de la Côte-Sainte-Catherine, Montréal, Canada H3T 2A7 - [email protected]
§ Canada Research Chair in Distribution Management, HEC Montréal, 3000 chemin de la Côte-Sainte-Catherine, Montréal, Canada H3T 2A7 - [email protected], [email protected]
¶ ENEA - Ente per le Nuove Tecnologie, l’Energia e l’Ambiente - C.R. Casaccia, via Anguillarese 301, 00060 Roma (RM) - Italy - [email protected]
1 Introduction
We describe and solve an operational problem faced by a freight forwarder. Given a set of
origin-destination transport requests, one must optimally route these requests in a multimodal
network. We assume that the freight forwarder does not operate a vehicle fleet, but can access
a heterogeneous set of transportation services. These services can be classified according to
two main characteristics: type of departure time, and cost function. We differentiate between
timetabled services and time-flexible services. Usually, rail and short sea shipping modes are
operated with fixed departure times while trucks have flexible departures. Some services allow consolidation of shipments between two terminals. A terminal is where a transfer can take
place between modes or between different vehicles of the same mode. Consolidation enables
the sharing of fixed costs. This effect is captured by piecewise linear (PL) cost functions that depend
upon the total service load. These cost functions are non-convex and, in general, non-concave.
Other types of services do not allow consolidation and their cost function is thus that of the
single shipment. We therefore distinguish between consolidation and dedicated services. Consolidation services present multiple capacity constraints, e.g. volume, weight, train length, etc.
Dedicated services are not viewed as capacitated because, for a given shipment, they are either feasible or not considered at all. The pickup of a transport request is done by selecting among
multiple time windows, e.g. 08:00-10:00 on day one or 08:00-10:00 on day
two of the planning horizon. Similarly, there are multiple time windows for the delivery. The
chosen route must respect the opening time windows of the terminals. Because of these characteristics, namely multimodal, multiple capacity constraints, and multiple time windows, we
denote this problem as the M++ Routing Problem (M++RP).
While this paper attempts to set a general framework for multimodal routing, it stems from
a real-life application. A freight forwarder operating on the Italian market can integrate rail
and road transportation services. The largest customer requires shipments from factories located in Northern Italy to regional distribution warehouses in Central-South Italy. A factory
must send many different shipments: they are distinct because there can be different destinations or incompatible pickup and delivery time windows. Block trains can be activated between
stations. At the arrival station the shipment can be sent to its final destination by truck. The
freight forwarder can consolidate shipments by using trucks between factories and intermediate platforms. These nodes are then connected to the warehouses by dedicated truck services.
However, the dedicated origin-destination shipment by truck is always a possible option. In
addition, consolidation by truck can take place between a factory and a loading train station.
The remainder of the paper is organized as follows. Section 2 discusses the relevant literature. Notation and graph construction procedures are introduced in Section 3, while Section
4 describes mathematical models and their properties. Section 5 outlines algorithms to obtain
upper and lower bounds, while Section 6 presents computational results. Conclusions are reported in Section 7.
2 Literature Review
We first review freight transportation surveys in Section 2.1 in order to position the problem
within the relevant literature. Available methodologies for the representation of multimodal
networks are discussed in Section 2.2. The literature dealing with problems similar to ours is
assessed in Section 2.3, while Section 2.4 considers the optimization methodologies that are
relevant to our approach.
2.1 Freight logistics surveys
Crainic and Laporte (1997) extensively review the optimization models for freight transportation. A main distinction can be drawn between strategic-tactical models, which consider a national or an international multimodal network, as in
the Service Network Design Problem (SNDP) (see Crainic (2000)), and operational models, typically unimodal distribution
management models that are variants of the Vehicle Routing Problem (VRP) (see Toth and Vigo
(2002)). Macharis and Bontekoning (2004) present a freight logistics literature review focused
on intermodal transportation. They propose a classification based on two criteria: the type
of operator and the length of the problem’s time horizon. Four types of operators are distinguished: drayage operators, terminal managers, network planners, and intermodal operators.
The time horizon criterion results in the classical differentiation of strategic, tactical, and operational levels. In this classification matrix of twelve categories, the M++RP would correspond to
operational problems faced by an intermodal operator, since the problem can be stated as the
selection of routes and of services in a multimodal network. This problem category, according
to Macharis and Bontekoning (2004) and to our own updated survey (see Section 2.3), is one
of the least studied. Bontekoning et al. (2004) review the intermodal literature related to the
rail-truck combination. This paper, like the previous one, highlights the need for more research
on operational problems faced by intermodal operators, like the optimal route selection. Container-based transportation is the key enabler of intermodalism because of various advantages
like higher productivity during the transfers, less product damage, etc. Consequently, Crainic
and Kim (2006) focus their recent intermodal logistics literature review on the container-related aspects of the transportation industry. In particular, empty container repositioning and
container terminal management problems are thoroughly discussed.
2.2 Representing multimodal networks
A multimodal network involves services that can operate on the same infrastructure, for instance trucks of different capacities using the same highway link. Furthermore, important
operations, like mode transfers, are not represented on the geographical, or physical, network.
These require the creation of a so-called “virtual network” where links relate to different operations in the multimodal chain. In multimodal freight transportation models the virtual network
definition dates back to the early 1990s (Guélat et al., 1990; Crainic et al., 1990). Jourquin et al.
(1999) present a methodology for the automatic generation of virtual networks in a Geographic
Information System (GIS). Similarly, Southworth and Peterson (2000) deal with the digital representation of a multimodal and international freight transportation network. They extend
a commercial GIS to handle intermodal transfers and network access and egress. The above
cited methodologies were aimed at tactical and strategic planning. Accordingly, operational
aspects, like time synchronization, were not considered. Our contribution, with respect to this
literature, consists of procedures for building virtual networks with embedded time synchronization.
2.3 Applications related literature
Barnhart and Ratliff (1993) introduce two models to determine a minimum cost route combining truck and rail services. The first model holds when the rail service has a per trailer cost,
while in the second model the rail cost is per flatcar with a maximum of two trailers per flatcar.
Time related constraints are added for trailer availability and maximum allowed delivery time.
Under these hypotheses the minimum cost route for the first model is easily found by means of
a shortest path algorithm, while for the second model a non-bipartite weighted matching procedure is applied. Min (1991) studies an international intermodal supply chain. He presents
a chance-constrained goal programming model to select the intermodal mix that minimizes
costs and risks while satisfying on-time service requirements. Boardman et al. (1997) sketch
a decision support system that selects the multimodal route by means of a k-shortest path algorithm. Ziliaskopoulos and Wardell (2000) present an algorithm for computing an optimal
path in a multimodal network when travel and transfer times are dynamic. They also consider
timetabled services. Their approach is aimed at realistic simulation of large networks both in
freight and in passenger transportation. Aldaihani and Dessouky (2003) deal with the problem
of mixing timetabled and flexible-time transportation services at the operational planning level
in paratransit. They propose a tabu search heuristic as a solution approach.
The work of Chang (2007) is the closest to our problem. Chang studies an intercontinental
transportation network where there are timetabled transportation services, economies of scale
represented as piecewise linear concave (PLC) cost functions, and delivery time windows. The
main difference with respect to our problem is that he assumes that commodity flows can bifurcate. This assumption is reasonable when considering large shipments at the intercontinental
scale. We deal, in contrast, with an inter-regional network where routing a shipment between
more than one path is not an acceptable practice. Furthermore, our solution approach is different. Chang applies a Lagrangean relaxation scheme introduced by Amiri and Pirkul (1997) for
a multicommodity flow problem with PLC costs. Other differences are that we consider more
general cost functions which are not necessarily concave, and multiple time windows. We
note that the multiple time windows feature is a relatively recent issue in the vehicle routing
literature (Xu et al., 2003; Ibaraki et al., 2005; Caramia et al., 2007).
2.4 Methodology related literature
From a mathematical programming point of view our problem is an extension of capacitated
network design models (for a review see, e.g. Gendron et al. (1998)). Since a shipment must
use a single path from origin to destination in a capacitated network, it is similar to the Origin-Destination Integer Multi-Commodity Flow Problem (ODIMCFP) introduced by Barnhart et al.
(2000). In fact, we will formulate the M++RP as an ODIMCFP with time windows, PL cost functions, and resource consumption side constraints. Analogously to the ODIMCFP, the M++RP
can be modeled with path variables and solved by column generation. For a survey on column
generation see, e.g., Desaulniers et al. (2005). Because of the M++RP special structure, the pricing problem is a Shortest Path Problem with Time Windows (SPPTW) studied, e.g., in Desrochers
and Soumis (1988), and in Feillet et al. (2004). The literature dealing with piecewise linear
costs is also relevant to our problem. Kim and Pardalos (1999) introduce a dynamic slope scaling algorithm to heuristically solve the Fixed Charge Network Flow Problem (FCNFP). Kim and
Pardalos (2000) applied similar algorithms to the Concave Piecewise Linear Network Flow Problem
(CPLNFP). In fact, through an arc separation procedure, the CPLNFP can be transformed into
an FCNFP on an extended graph. A more refined algorithm variant, which employs a trust
interval technique, was also presented in the same paper. The dynamic slope scaling concept
was exploited by Crainic et al. (2004) to solve the multicommodity version of the FCNFP. The
authors propose a heuristic algorithm that combines slope scaling, Lagrangean relaxation, intensification and diversification mechanisms as in metaheuristics. Croxton et al. (2003a) prove
that three textbook mixed-integer linear programming (MILP) formulations of a generic minimization problem with separable non-convex piecewise linear costs are equivalent. Their LP
relaxations approximate the piecewise linear cost function with its lower convex envelope. Independently, Keha et al. (2004, 2006) derived a similar result. Croxton et al. (2007) present valid
inequalities based upon variable disaggregation for network flow problems with piecewise linear costs. Croxton et al. (2003b) study an application, the merge-in-transit problem, where the
above mentioned technique shows its efficacy.
3 Virtual Network Representation
In this section we introduce the notation and the constructive procedures to represent the
M++RP on a digraph. Sections 3.1 to 3.5 describe the procedures for the problem’s main features, and Section 3.6 summarizes the digraph notation.
3.1 Pickup and delivery time windows
Let K be the set of shipments, or commodities, to be routed during the planning horizon. Each
shipment k ∈ K is characterized by a set Ω(k) = {1, ..., |Ω(k)|} of pickup time windows, and a
set Γ(k) = {1, ..., |Γ(k)|} of delivery time windows. The penalized delivery time windows are
grouped in the set Γp (k) ⊂ Γ(k) and there is a cost cγ , γ ∈ Γp (k), associated to each of them, i.e.
we assume a staircase penalty cost function upon the arrival time. With this notation we can
construct a first part of the directed graph G = (N, A) on which the problem is defined. We
denote by O the set of origin nodes, and by D the set of destination nodes. The origin node of a
shipment k ∈ K is denoted by ok ∈ O while the destination node is dk ∈ D. We then add |Ω(k)|
nodes representing the pickup time windows and |Γ(k)| nodes for the delivery time windows.
We maintain the previous notation and the sets Ω(k) and Γ(k) indicate also the pickup and
delivery time windows node sets. An origin node ok is linked to the Ω(k) nodes by arcs with
zero traveling time and zero cost. The notation is similar for the destination node, but the arcs
between the Γp (k) nodes and dk have a cost cγ , γ ∈ Γp (k). The resulting portion of the digraph
is illustrated in Figure 1, where a time window interval is denoted as [ai , bi ] for a generic node
i ∈ Ω(k) ∪ Γ(k).
Figure 1: Portion of the digraph representing the pickup and delivery time windows for a shipment
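The construction described in this section can be sketched in a few lines of Python. This is only an illustrative sketch: the `Digraph` container, the tuple-based node names, and the sample data are assumptions introduced here, not part of the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Digraph:
    """Minimal container (hypothetical): node time windows and arc attributes."""
    windows: dict = field(default_factory=dict)  # node -> (a_i, b_i)
    arcs: dict = field(default_factory=dict)     # (i, j) -> (travel_time, cost)


def add_shipment_windows(g, k, pickup, delivery, penalties):
    """Add the origin, the destination and the time-window nodes of shipment k.

    pickup, delivery: lists of (a, b) intervals, one per pickup/delivery window.
    penalties: dict {delivery window index: penalty cost} for the penalized
               delivery windows; windows not listed carry no penalty.
    """
    origin, destination = ("o", k), ("d", k)
    for n, (a, b) in enumerate(pickup, start=1):
        node = ("omega", k, n)
        g.windows[node] = (a, b)
        g.arcs[(origin, node)] = (0.0, 0.0)  # zero traveling time, zero cost
    for n, (a, b) in enumerate(delivery, start=1):
        node = ("gamma", k, n)
        g.windows[node] = (a, b)
        # Zero traveling time; the penalty applies only to penalized windows.
        g.arcs[(node, destination)] = (0.0, penalties.get(n, 0.0))
    return origin, destination


# Hypothetical shipment: two pickup windows, two delivery windows (hours from
# the start of the horizon), the second delivery window being penalized.
g = Digraph()
o, d = add_shipment_windows(
    g, "k1",
    pickup=[(8.0, 10.0), (32.0, 34.0)],
    delivery=[(40.0, 48.0), (64.0, 72.0)],
    penalties={2: 50.0},
)
```

The resulting portion of the digraph is still separable by shipment, since every node and arc created here carries the shipment identifier.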
3.2 Accessing a terminal and timetabled consolidation services
The previously described procedure results in a digraph which is still separable by shipment.
However, the representation of consolidation services introduces nodes and arcs that can be
used by more than one shipment. Let us construct, as an example, the access to a terminal i
and the selection of a Timetabled Consolidation Service (TCS) from terminal i to terminal j. The
physical network could be synthetically represented by two nodes, i and j, and one linking arc.
The virtual network must take into account many operational characteristics, like:
• opening hours of the terminals;
• transfer times and costs that depend upon the entering mode or vehicle type, the departing mode, and the shipment requests;
• timetables of the available services.
Figure 2: Example of exploding the physical network to the virtual one
Figure 2 depicts the following example: two terminals linked by two TCSs, e.g. shuttle trains,
in a two-day planning horizon, and with two possible entering modes, e.g. truck and rail.
We designate the departure of the two services by two nodes, i1 and i2 , with collapsed time
windows. A collapsed time window for a generic node i is such that ai = bi = di , where di
indicates the departure time. The entrance to terminal i requires, in our example, four nodes to
describe the possible combinations between modes and opening hours in the two-day planning
horizon. We assume one opening time window per day: [ai1 , bi1 ] for day one, and [ai2 , bi2 ] for day
two. We link the entrance nodes with the compatible departure nodes. We can have an arc
between an entrance node $e$ and a departure node $i$ only if $a_e + \min_{k \in K} t^k_{ei} \le d_i$, where $t^k_{ei}$ is the
required time for the shipment $k$ to move between the nodes $e$ and $i$. In the example of Figure
2 we assume that the service represented by the node $i_1$ departs on day one and is therefore
compatible only with entrance nodes $e_1$ and $e_2$ which represent day one arrivals. Conversely,
the service $i_2$ departs on day two and is compatible with all four entrance nodes $e_1$ to $e_4$.
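The compatibility rule between entrance and departure nodes can be illustrated with a short Python sketch. The function name, the data layout, and the numerical values (hours from the start of the horizon, a constant two-hour transfer time) are assumptions chosen to mimic the example of Figure 2.

```python
def compatible_arcs(entrances, departures, transfer_time, shipments):
    """Return the (e, i) pairs such that a_e + min_k t^k_ei <= d_i.

    entrances:     dict {e: (a_e, b_e)}, opening window of each entrance node
    departures:    dict {i: d_i}, departure time of each collapsed-window node
    transfer_time: function (k, e, i) -> time needed by shipment k from e to i
    shipments:     iterable of shipment identifiers
    """
    shipments = list(shipments)
    arcs = []
    for e, (a_e, _b_e) in entrances.items():
        for i, d_i in departures.items():
            fastest = min(transfer_time(k, e, i) for k in shipments)
            if a_e + fastest <= d_i:
                arcs.append((e, i))
    return arcs


# Hypothetical data: two entering modes over two days, one departure per day.
entrances = {
    "e1": (8.0, 18.0),   # mode 1, day 1
    "e2": (8.0, 18.0),   # mode 2, day 1
    "e3": (32.0, 42.0),  # mode 1, day 2
    "e4": (32.0, 42.0),  # mode 2, day 2
}
departures = {"i1": 20.0, "i2": 44.0}
arcs = compatible_arcs(entrances, departures, lambda k, e, i: 2.0, ["k1", "k2"])
```

With these values the day-one departure is reachable only from the two day-one entrance nodes, while the day-two departure is reachable from all four, as in the text.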
In this representation, costs and times on the arcs between entrance and departure nodes
are related to three attributes: ingoing mode, outgoing mode, and shipment. The arc leaving a
departure node connects with the corresponding arrival node and presents a PL cost function
which depends upon the cumulated quantity, e.g. for a generic arc $(i, j)$ we have $g_{ij}(q_{ij})$, where
$q_{ij}$ is the arc load. The arc load is the sum of the quantities $q^k_{ij}$, i.e. the arc specific quantity of
the shipment $k$, for all the transport orders that use $(i, j)$. As we will show in Section 4.2, the
MILP model of the PL cost function will ensure that the capacity constraint related to the $q^k_{ij}$
unit of measure will also be satisfied. For notational simplicity let us assume that we have to
take into account only a second type of capacity constraint on the arc $(i, j)$. For instance, the
$q^k_{ij}$ values express volume while there is an additional capacity constraint for the weight. We
denote by $W_{ij}$ the arc capacity and with $w^k_{ij}$ the shipment requests. Therefore, consolidation
service arcs will be modeled as capacitated.
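As a minimal sketch of this bookkeeping, the following Python fragment sums the shipment quantities on one consolidation arc and checks the second capacity constraint. The function name, the dictionaries, and the numbers are illustrative assumptions.

```python
def arc_load_and_weight_check(assigned, volume, weight, weight_capacity):
    """Return (arc load, feasibility flag) for the shipments on one arc (i, j).

    assigned:        iterable of shipment ids routed on the arc
    volume:          dict {k: q^k_ij}, the quantity that drives the PL cost
    weight:          dict {k: w^k_ij}, the quantity limited by W_ij
    weight_capacity: the arc capacity W_ij
    """
    assigned = list(assigned)
    load = sum(volume[k] for k in assigned)
    total_weight = sum(weight[k] for k in assigned)
    return load, total_weight <= weight_capacity


# Hypothetical data: three shipments consolidated on one shuttle-train arc.
volume = {"k1": 12.0, "k2": 8.0, "k3": 15.0}
weight = {"k1": 5.0, "k2": 9.0, "k3": 4.0}
load, feasible = arc_load_and_weight_check(["k1", "k2", "k3"], volume, weight, 20.0)
```

The load returned here is the argument of the PL cost function, while the flag corresponds to the additional weight constraint.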
3.3 Flexible consolidation services
Flexible consolidation services (FCSs) are not constrained by fixed departure dates. However, we
still need to take into account the synchronization with the arrival of the assigned shipments.
First consider that FCSs usually have a time window for the departure. Therefore, we model
an FCS with a sufficiently large set of nodes with fixed departure times equally spaced along
the departure time window. The optimal routing will look for consolidation and, consequently,
a minimal number of departure nodes will be used. We implicitly assume that the number
of available vehicles of the FCS is large enough to cover the assigned transportation demand.
Since FCSs relate to trucks, this hypothesis is not restrictive.
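The discretization of an FCS departure time window can be sketched as follows. The function name and the choice of including both window endpoints are assumptions; the text only requires a sufficiently large set of equally spaced departure times.

```python
def fcs_departure_times(a, b, n):
    """Return n equally spaced departure times covering the window [a, b].

    Each returned time d is meant to become one departure node with a
    collapsed time window, i.e. a_i = b_i = d_i = d.
    """
    if n < 1 or b < a:
        raise ValueError("need n >= 1 and a <= b")
    if n == 1:
        return [a]
    step = (b - a) / (n - 1)
    return [a + m * step for m in range(n)]


# Hypothetical truck service that may leave between 06:00 and 10:00.
times = fcs_departure_times(6.0, 10.0, 5)
```

A larger `n` gives a finer synchronization with shipment arrivals at the price of a larger digraph.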
3.4 Dedicated services
A dedicated service arises whenever the freight forwarder cannot, for technical or managerial reasons, share transportation means between different shipments. In a more general way
we define a dedicated service whenever the cost function is separable by shipments. Usually
dedicated services are performed by truck, and therefore they have flexible departure times.
Transportation by truck from the shipment origins to terminals or from terminals to destinations, i.e. drayage operations, is an example of this type of service. These flexible dedicated
services (FDSs) are the easiest to represent on a digraph: an arc between two geographically
related nodes with traveling time and cost that are shipment dependent. However, in a multimodal network we can have transportation services whose cost function is that of an FDS, but
having timetables or guaranteed arrival times. Examples are short sea shipping line services
where there are scheduled departures and arrivals at ports. Capacities of these transportation
services are relatively large with respect to the requests of a freight forwarder. Thus, the negotiated contract for this type of service has a per container cost. These timetabled dedicated
services (TDSs) can be represented with nodes featuring collapsed time windows, as in the TCS
case. However, the arcs are uncapacitated and have separable cost functions, i.e. $c^k_{ij}$ values.
The last type of time synchronization that we consider, the guaranteed arrival time, requires a
departure node with a time window for the consignment and an arrival node with a collapsed
time window for the arrival time.
3.5 Direct origin-destination arcs
The digraph on which the problem is defined contains an arc between the origin and the destination nodes for each shipment. A direct origin-destination arc yields the minimum cost
route that can be obtained by dedicated services only. This cost could be computed by solving an
optimization problem in a separate graph for each shipment. It is easy to see that we can determine the minimum cost route by applying a shortest path algorithm with time windows on
the digraph of the dedicated services. Because of the availability of effective SPPTW algorithms
and the relatively small size of the reduced digraph, the problem is not difficult. In practice we
have to compare a few alternatives. We have to choose between direct origin-destination transportation by truck, or a combination, if available, of drayage operations and dedicated services
between terminals. The existence of these “virtual” direct origin-destination arcs, $(o_k, d_k)$, helps
our solution approach, as explained in Section 5.5. Additionally, we have a cost estimate when
shipment consolidation is not considered.
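The cost of a direct origin-destination arc can be obtained with any shortest path algorithm with time windows. The following Python sketch is a simple label-setting scheme, not the specific SPPTW algorithm used by the authors. It assumes non-negative arc costs, free waiting at a node until its window opens, and a departure at the opening of the origin window; all names and data are hypothetical.

```python
import heapq
import itertools


def spptw(arcs, windows, source, target):
    """Minimum-cost path that respects node time windows.

    arcs:    dict {i: [(j, travel_time, cost), ...]} for one shipment
    windows: dict {i: (a_i, b_i)}; arriving before a_i means waiting,
             arriving after b_i is infeasible
    Returns (cost, path), or None if no feasible path exists.
    """
    tie = itertools.count()                 # tie-breaker for the heap
    start, _ = windows[source]
    heap = [(0.0, start, next(tie), [source])]
    labels = {}                             # node -> kept (cost, time) labels
    while heap:
        cost, time, _, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            return cost, path               # first target label is cheapest
        kept = labels.setdefault(node, [])
        if any(c <= cost and t <= time for c, t in kept):
            continue                        # dominated label
        kept.append((cost, time))
        for succ, travel_time, arc_cost in arcs.get(node, []):
            a, b = windows[succ]
            arrival = max(time + travel_time, a)
            if arrival <= b:
                heapq.heappush(
                    heap, (cost + arc_cost, arrival, next(tie), path + [succ])
                )
    return None


# Hypothetical dedicated-service digraph: direct truck versus a route via A.
arcs = {"o": [("d", 10.0, 100.0), ("A", 3.0, 30.0)], "A": [("d", 4.0, 40.0)]}
wide = {"o": (0.0, 1.0), "A": (0.0, 5.0), "d": (0.0, 24.0)}
tight = {"o": (0.0, 1.0), "A": (0.0, 2.0), "d": (0.0, 24.0)}
via_terminal = spptw(arcs, wide, "o", "d")
direct_only = spptw(arcs, tight, "o", "d")
```

When the window of the intermediate node closes too early, only the direct truck arc remains feasible and its cost becomes the cost of the virtual arc.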
3.6 Digraph node and arc sets
To summarize, the M++RP can be represented on a digraph with the following nodes:
• origin and destination nodes, sets O and D;
• pickup and delivery nodes with time windows, sets $\Omega = \cup_{k \in K} \Omega(k)$ and $\Gamma = \cup_{k \in K} \Gamma(k)$;
• terminal entrance nodes with time windows, set $N^e$;
• scheduled departure nodes or time guaranteed arrival nodes, i.e. with collapsed time
window, set $N^d$;
• consignment nodes with time windows, set $N^c$.
Let $N$ be the union of the above listed node sets. For notational convenience we denote by $N^{tw}$
the set of the time windowed nodes, i.e. $N^{tw} = \Omega \cup N^e \cup N^c \cup \Gamma$. The arc set $A \subset N \times N$ is
defined according to operational constraints as outlined in the above examples. We distinguish
two disjoint subsets of $A$:
• the set $A^v$, where the arc cost function is separable by shipment, i.e. we have costs $c^k_{ij} > 0$, $\forall (i, j) \in A^v$; these arcs represent transportation costs of dedicated services, transfer
costs inside terminals, penalty costs for late arrivals, etc.
• the set $A^{pl}$, where the arc cost function $g_{ij}()$ is PL and depends upon the total arc load $q_{ij}$.
4 Mathematical Models
In this section we model the problem as an origin-destination integer multicommodity flow
problem with PL costs, time windows, and side constraints over the digraph G defined in the
previous section. We first introduce, in Section 4.1, the variables and constraints needed to
model the PL cost functions. Here we discuss also properties relevant to our approach. We
then present a node-arc formulation, F1 , which is described in Section 4.2. However, since
F1 results in very large integer programs, we devise a decomposition scheme using a path
based formulation, F2 , which is presented in Section 4.3. An approximate compact formulation,
F2 L, is introduced in Section 4.4 and its relevance is discussed in Section 4.5. Finally, valid
inequalities from the literature are adapted to both formulations F1 and F2 in Section 4.6.
4.1 Modeling the PL cost functions
We have in our problem $|A^{pl}|$ PL cost functions, possibly distinct. We indicate with $S_{ij} = \{1, ..., |S_{ij}|\}$ the set of linear segments of the cost function $g_{ij}$, $(i, j) \in A^{pl}$. Each segment $s \in S_{ij}$ has a variable cost or slope, $c^s_{ij}$, a non-negative fixed cost, $f^s_{ij}$, and the breakpoints, $r^{s-1}_{ij}$ and $r^s_{ij}$. We denote by $\alpha^s_{ij}$ the slope of the line joining the origin with the point of segment maximum flow, i.e. $\alpha^s_{ij} = g_{ij}(r^s_{ij})/r^s_{ij}$. We do not assume continuity, but we require that the function be lower semicontinuous: $g_{ij}(q) \le \liminf_{q' \to q} g_{ij}(q')$. To a zero arc load corresponds a zero cost: $g_{ij}(0) = 0$. An illustration is provided in Figure 3, where, for notational simplicity, we drop the subscript $ij$.
Figure 3: Notation of the PL cost function
We use the so-called multiple choice model (MCM) to represent these PL functions (see, e.g., Croxton et al. (2003a)). The MCM employs the following variables:
• $l^s_{ij} \in \mathbb{R}_+$, $\forall (i, j) \in A^{pl}$, $\forall s \in S_{ij}$, expresses the arc load on segment $s$; $l^s_{ij} > 0$ implies $l^u_{ij} = 0$, $\forall u \in S_{ij} \setminus \{s\}$, and $r^{s-1}_{ij} \le l^s_{ij} \le r^s_{ij}$;
• $y^s_{ij} \in \{0, 1\}$, $\forall (i, j) \in A^{pl}$, $\forall s \in S_{ij}$, where $y^s_{ij} = 1$ if the arc load belongs to the segment $s$ of the cost function $g_{ij}$; otherwise $y^s_{ij} = 0$.
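Then, dropping the subscript $ij$, for a given arc load $q$, we obtain the following MCM mixed-integer linear formulation for $g(q)$:
\begin{align}
\text{minimize} \quad & \sum_{s \in S} (c^s l^s + f^s y^s) \tag{1} \\
\text{subject to} \quad & \sum_{s \in S} l^s \ge q, \tag{2} \\
& r^{s-1} y^s \le l^s \le r^s y^s && \forall s \in S, \tag{3} \\
& \sum_{s \in S} y^s \le 1, \tag{4} \\
& l^s \in \mathbb{R}_+ && \forall s \in S, \tag{5} \\
& y^s \in \{0, 1\} && \forall s \in S. \tag{6}
\end{align}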
Because of the binary requirement on the $y^s$ variables and of constraint (4), only one
segment can be selected. Constraints (2) and (3) link the arc load, $q$, and the choice of the segment $s$ such that
$r^{s-1} \le q \le r^s$. As a result, the objective function (1) expresses $g(q)$ by the fixed cost and slope
of the chosen segment.
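The effect of the integer MCM can be mimicked by a direct evaluation of the cost function. The sketch below is not a MILP model: it simply picks the segment containing the load and pays its slope and fixed cost. The tuple layout `(r_prev, r, c, f)` and the sample segments are assumptions; at a shared breakpoint the cheaper segment is taken, in line with lower semicontinuity.

```python
def pl_cost(q, segments):
    """Evaluate g(q) for segments given as (r_prev, r, c, f) tuples.

    r_prev, r: breakpoints of the segment, c: slope, f: fixed cost.
    A segment s is admissible when r_prev <= q <= r, and its cost is
    c * q + f. By convention g(0) = 0.
    """
    if q == 0:
        return 0.0
    costs = [c * q + f for r_prev, r, c, f in segments if r_prev <= q <= r]
    if not costs:
        raise ValueError("no segment contains the arc load q")
    return min(costs)


# Hypothetical three-segment cost function with capacity 60.
segments = [
    (0.0, 10.0, 5.0, 20.0),
    (10.0, 30.0, 3.0, 50.0),
    (30.0, 60.0, 2.0, 90.0),
]
cost_at_5 = pl_cost(5.0, segments)
cost_at_10 = pl_cost(10.0, segments)   # breakpoint shared by segments 1 and 2
cost_at_20 = pl_cost(20.0, segments)
```

A load beyond the last breakpoint raises an error, which reflects the capacity role played by the last breakpoint in constraints (3).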
Croxton et al. (2003a) proved that a piecewise linear cost function will be approximated by
its lower convex envelope when relaxing integrality constraints in the multiple choice model.
The lower convex envelope of a function g is the greatest convex function majorized by g (Rockafellar (1970, p. 36)). Croxton et al. derived this result for a second formulation (the convex
combination model), and then they extended it to the MCM and to a third MILP formulation
by proving the equivalence of their LP relaxations. Here we discuss a particular case of this
result. Croxton et al. observed that under concavity the lower convex envelope approximation
assumes a linear form: it is the line joining the origin with the point of maximum feasible flow.
We prove that this property holds under a milder assumption, i.e. whenever the minimum $\alpha^s$
value is associated with the last segment.
Proposition 1 Let $\hat{s}$ be such that $\alpha^{\hat{s}} = \min_{s \in S} \alpha^s$. Then $\alpha^{\hat{s}} q$ is a lower bound for the LP relaxation of
the MCM model. If $q \le r^{\hat{s}}$ then $\alpha^{\hat{s}} q$ is also its optimal value.
Proof. Since the binary requirement is relaxed and because of constraints (3), we will have $l^s/r^s \le y^s \le l^s/r^{s-1}$ in any feasible solution. Furthermore, the non-negative coefficients of $y^s$ in the objective function (1) will lead, in an optimal solution, to the equality $y^s = l^s/r^s$. We can then project out the $y^s$ variables. Since $\alpha^s = g(r^s)/r^s = c^s + f^s/r^s$, the objective function becomes $\sum_{s \in S} (c^s l^s + f^s l^s/r^s) = \sum_{s \in S} \alpha^s l^s$. Constraints (3) can be eliminated, constraints (2) and (5) hold, while constraint (4) can be expressed as
\begin{equation}
\sum_{s \in S} \frac{l^s}{r^s} \le 1. \tag{7}
\end{equation}
The resulting model is a continuous knapsack problem with $\alpha^s$ assignment costs in addition to constraint (7). It is immediate that $\alpha^{\hat{s}} q$ is a valid lower bound and, if the hypothesis $q \le r^{\hat{s}}$ holds, then the assignment of the total arc flow to the segment $\hat{s}$ is feasible (constraint (7) is satisfied), and $\alpha^{\hat{s}} q$ is the optimum, thus proving the proposition.
Assumption 1 Assume that we have a PL cost function such that $\alpha^{|S|} = \min_{s \in S} \alpha^s$.
Observation 1 Under Assumption 1 and a feasible flow $q$, the optimal cost of the linear relaxation of the MCM is equal to $\alpha^{|S|} q$ and an optimal solution can be characterized as follows: the selected segment is the last, $l^{|S|} = q$, and $y^{|S|} = q/r^{|S|} > 0$, while $l^s = y^s = 0$, $\forall s \in \{1, ..., |S| - 1\}$.
Observation 2 Under Assumption 1 and a feasible flow $q$, the optimal dual multiplier associated with constraint (2) is equal to $\alpha^{|S|}$.
Proof. Let $(\pi, \mu)$ be the pair of dual multipliers associated with constraints (2) and (7), respectively. Note that the pair $(\alpha^{|S|}, 0)$ is dual feasible and has a dual cost equal to $\alpha^{|S|} q$, which is the primal optimal cost of Observation 1. Therefore, by duality theory it is also dual optimal.
Cost functions related to consolidation generally satisfy Assumption 1. To clarify this statement we introduce a characterization of PL cost functions induced by consolidation. It is certainly legitimate to assume that the $c^s$ values, i.e. the segment slopes, are non-increasing in $s$. Furthermore, define the jumps of the fixed costs, $\Delta f^s$, and the length of the segments, $\Delta r^s$, as:
\[
\Delta f^s =
\begin{cases}
f^s - f^{s-1} & \text{if } s \in \{2, ..., |S|\} \\
f^1 & \text{if } s = 1,
\end{cases}
\qquad
\Delta r^s =
\begin{cases}
r^s - r^{s-1} & \text{if } s \in \{2, ..., |S|\} \\
r^1 & \text{if } s = 1.
\end{cases}
\]
The lengths of the segments where price discounts apply generally do not decrease, while the
fixed cost jumps do not increase, see Figure 3. This is summarized in the following assumption:
Assumption 2 We assume a PL cost function such that the $c^s$ and $\Delta f^s$ values are non-increasing in
the index $s$, while the $\Delta r^s$ series is non-decreasing.
We now prove that Assumption 2, which is motivated by the typical features of consolidation, implies
Assumption 1, and is thus stronger.
Proposition 2 Under Assumption 2, the $\alpha^s$ series is non-increasing.
Proof. Because $\alpha^s$ is the sum of the $c^s$ and $f^s/r^s$ terms and the $c^s$ values are non-increasing by assumption, it remains to prove that the $f^s/r^s$ series is non-increasing as well, i.e. $f^{s+1}/r^{s+1} \le f^s/r^s$, $\forall s \in \{1, ..., |S| - 1\}$. Observe that the non-increasing assumption on the $\Delta f^s$ series and the non-decreasing $\Delta r^s$ values allow us to write the following inequality:
\begin{equation}
\frac{f^{s+1}}{r^{s+1}} = \frac{f^s + \Delta f^{s+1}}{r^s + \Delta r^{s+1}} \le \frac{f^s + \Delta f^s}{r^s + \Delta r^s}. \tag{8}
\end{equation}
For $s = 1$ we have $\Delta f^1 = f^1$ and $\Delta r^1 = r^1$, so the inequality (8) becomes $f^2/r^2 \le f^1/r^1$, as required. This leads to a proof by induction. Let $f^s/r^s \le f^{s-1}/r^{s-1}$ be the induction hypothesis. The induction step follows from (8) once we show that the hypothesis implies:
\begin{equation}
\frac{f^s}{r^s} \le \frac{f^{s-1}}{r^{s-1}} \;\Rightarrow\; \frac{f^s + \Delta f^s}{r^s + \Delta r^s} = \frac{2 f^s - f^{s-1}}{2 r^s - r^{s-1}} \le \frac{f^s}{r^s}. \tag{9}
\end{equation}
The above implication can be proved by contradiction, negating the last inequality in (9):
\begin{equation}
\frac{2 f^s - f^{s-1}}{2 r^s - r^{s-1}} > \frac{f^s}{r^s} \;\Rightarrow\; \frac{f^s}{r^s} > \frac{f^{s-1}}{r^{s-1}}. \tag{10}
\end{equation}
In view of Proposition 2, we do not consider Assumption 1 to be limiting. The above results
allow us to handle more general non-convex cost functions that are not necessarily concave,
e.g. staircase functions, often found in practice.
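Proposition 2 and Observation 1 can be checked numerically on a small instance. The following Python sketch is only a sanity check under assumed data: the tuple layout `(r_prev, r, c, f)`, the helper names, and the segment values are hypothetical.

```python
def alphas(segments):
    """alpha^s = g(r^s) / r^s = c^s + f^s / r^s for (r_prev, r, c, f) tuples."""
    return [c + f / r for _r_prev, r, c, f in segments]


def deltas(values):
    """First differences with the convention that the first delta is values[0]."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def non_increasing(values):
    return all(a >= b for a, b in zip(values, values[1:]))


def non_decreasing(values):
    return all(a <= b for a, b in zip(values, values[1:]))


def satisfies_assumption_2(segments):
    """Slopes and fixed-cost jumps non-increasing, segment lengths non-decreasing."""
    slopes = [c for _, _, c, _ in segments]
    fixed = [f for _, _, _, f in segments]
    breaks = [r for _, r, _, _ in segments]
    return (
        non_increasing(slopes)
        and non_increasing(deltas(fixed))
        and non_decreasing(deltas(breaks))
    )


# Hypothetical consolidation tariff: jumps 20, 15, 10 and lengths 10, 20, 30.
segments = [
    (0.0, 10.0, 5.0, 20.0),
    (10.0, 30.0, 3.0, 35.0),
    (30.0, 60.0, 2.0, 45.0),
]
alpha = alphas(segments)
# Under Assumption 1 the LP relaxation of the MCM costs alpha^{|S|} * q.
lp_cost_at_20 = alpha[-1] * 20.0
```

On this instance the slopes from the origin are 7, about 4.17, and 2.75, so the last segment attains the minimum and the relaxed cost of a load of 20 is 55, well below the value 95 given by the second segment.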
4.2 Node-arc based formulation F1
The following additional notation is introduced to state the formulation. Let $I$ be the set of
intermediate nodes between origins and destinations, i.e. $I = N \setminus (O \cup D)$. For each node $i \in I$
the set $\delta^{+}(i)$ represents the nodes $j \in N$ such that $(i, j) \in A$. Similarly, $\delta^{-}(i)$ represents the
nodes $j \in N$ such that $(j, i) \in A$.
The decision variables of the node-arc based M++RP formulation, F1 , are:
• $x_{ij}^{k} \in \{0, 1\}$, $\forall (i, j) \in A$, $\forall k \in K$, where $x_{ij}^{k} = 1$ if shipment $k$ uses arc $(i, j)$, and $x_{ij}^{k} = 0$
otherwise;
• $T_{i}^{k} \in \mathbb{R}_{+}$, $\forall k \in K$, $\forall i \in I$, indicates the arrival time of shipment $k$ at node $i$; $T_{i}^{k} > 0$ if the
node is visited by the shipment, otherwise $T_{i}^{k} = 0$.
Using this notation, the node-arc formulation can now be presented:
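\[
\text{minimize} \quad \sum_{k \in K} \sum_{(i,j) \in A^{v}} c_{ij}^{k} x_{ij}^{k} + \sum_{(i,j) \in A^{pl}} \sum_{s \in S_{ij}} \left( c_{ij}^{s} l_{ij}^{s} + f_{ij}^{s} y_{ij}^{s} \right) \tag{11}
\]
subject to
\begin{align}
& \sum_{j \in \Omega(k) \cup \{d_k\}} x_{o_k j}^{k} = 1 && \forall k \in K, \tag{12} \\
& \sum_{i \in \Gamma(k) \cup \{o_k\}} x_{i d_k}^{k} = 1 && \forall k \in K, \tag{13} \\
& \sum_{j \in \delta^{+}(i)} x_{ij}^{k} - \sum_{j \in \delta^{-}(i)} x_{ji}^{k} = 0 && \forall k \in K, \forall i \in I, \tag{14} \\
& T_{i}^{k} + t_{ij}^{k} - T_{j}^{k} \le (1 - x_{ij}^{k}) M_{ij}^{k} && \forall k \in K, \forall (i,j) \in A \cap (I \times I), \tag{15} \\
& a_i \sum_{j \in \delta^{+}(i)} x_{ij}^{k} \le T_{i}^{k} \le b_i \sum_{j \in \delta^{+}(i)} x_{ij}^{k} && \forall k \in K, \forall i \in N^{tw}, \tag{16} \\
& T_{i}^{k} = d_i \sum_{j \in \delta^{+}(i)} x_{ij}^{k} && \forall k \in K, \forall i \in N^{d}, \tag{17} \\
& \sum_{s \in S_{ij}} l_{ij}^{s} \ge \sum_{k \in K} q_{ij}^{k} x_{ij}^{k} && \forall (i,j) \in A^{pl}, \tag{18} \\
& r_{ij}^{s-1} y_{ij}^{s} \le l_{ij}^{s} \le r_{ij}^{s} y_{ij}^{s} && \forall (i,j) \in A^{pl}, \forall s \in S_{ij}, \tag{19} \\
& \sum_{s \in S_{ij}} y_{ij}^{s} \le 1 && \forall (i,j) \in A^{pl}, \tag{20} \\
& \sum_{k \in K} w_{ij}^{k} x_{ij}^{k} \le W_{ij} && \forall (i,j) \in A^{pl}, \tag{21} \\
& T_{i}^{k} \in \mathbb{R}_{+} && \forall k \in K, \forall i \in I, \tag{22} \\
& l_{ij}^{s} \in \mathbb{R}_{+} && \forall (i,j) \in A^{pl}, \forall s \in S_{ij}, \tag{23} \\
& y_{ij}^{s} \in \{0, 1\} && \forall (i,j) \in A^{pl}, \forall s \in S_{ij}, \tag{24} \\
& x_{ij}^{k} \in \{0, 1\} && \forall (i,j) \in A, k \in K. \tag{25}
\end{align}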
Here $M_{ij}^{k}$ assumes the value $b_i + t_{ij}^{k}$ if $i \in N^{tw}$, it is set equal to $d_i + t_{ij}^{k}$ if $i \in N^{d}$, and it is
set equal to a sufficiently large number otherwise. The objective function (11) minimizes the
total cost incurred by using arcs with a shipment-specific cost, i.e. arcs belonging to $A^{v}$, and by
using consolidation services, i.e. arcs in $A^{pl}$, where PL cost functions hold. Constraints (12) and (13)
define the degree of the origin and destination nodes, respectively. Flow conservation for the
remaining nodes is ensured by constraints (14). Propagation of the time variables $T_{i}^{k}$ is enforced
by constraints (15), while time windows and timetables are enforced by constraints (16) and
(17), respectively. Constraints (18) to (20) are similar to the MCM constraints (2) to (4). They
link the arc load, $\sum_{k \in K} q_{ij}^{k} x_{ij}^{k}$, and the choice of the segment $s$ of the arc cost function such that
$r_{ij}^{s-1} \le \sum_{k \in K} q_{ij}^{k} x_{ij}^{k} \le r_{ij}^{s}$. Note that (19) is also useful to enforce capacity constraints over the
quantities $q_{ij}^{k}$. In fact, the arc load $\sum_{k \in K} q_{ij}^{k} x_{ij}^{k}$ must not be greater than $r_{ij}^{|S_{ij}|}$. The resource
consumption upper bounds over the $w_{ij}^{k}$ values are modeled by constraints (21).
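The role of constraints (18) to (20) can be illustrated by evaluating the PL cost of a given arc load. The following is a minimal sketch, not the authors' implementation: the function name and the tariff are hypothetical, $r^{0} = 0$ is assumed, and when a load sits exactly on a breakpoint the cheaper of the two adjacent segments is taken, which is what a minimization over (19) and (20) would select.

```python
def pl_arc_cost(load, c, f, r):
    """Cheapest cost c^s * load + f^s over segments s with r^(s-1) <= load <= r^s.

    c, f : slopes and fixed costs per segment
    r    : upper breakpoints r^1 < ... < r^|S| (r^0 = 0 is assumed)
    Returns 0.0 for a zero load (no segment selected, all y^s = 0) and raises
    ValueError when the load exceeds r^|S|, mirroring the capacity role of (19).
    """
    if load < 0:
        raise ValueError("load must be non-negative")
    if load == 0:
        return 0.0
    if load > r[-1]:
        raise ValueError("load exceeds the last breakpoint r^|S|")
    lower = [0] + list(r[:-1])
    candidates = [c[s] * load + f[s]
                  for s in range(len(r))
                  if lower[s] <= load <= r[s]]
    return min(candidates)

# Hypothetical tariff: slopes, fixed costs, breakpoints.
c, f, r = [5, 4, 3], [10, 16, 20], [10, 25, 45]
for load in (8, 10, 30):
    print(load, pl_arc_cost(load, c, f, r))
```

Here the arc load stands for $\sum_{k \in K} q_{ij}^{k} x_{ij}^{k}$ once the $x$ variables are fixed.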
4.3 Path based formulation F2
A path based formulation focuses on paths between the origin and the destination nodes over
the digraph G. For each shipment k ∈ K, let P (k) represent the set of all feasible paths on
G. Each path must start from the origin node ok , end at the destination node dk and respect
the time related constraints. Concisely, a path satisfies constraints (12) to (17), and (25). For
each feasible path $p \in P(k)$, we define a binary decision variable $z_{p}^{k}$, where $z_{p}^{k} = 1$ if and only if
shipment $k$ is assigned to path $p$. Let $P = \bigcup_{k \in K} P(k)$. We also introduce the following notation:
• $\phi_{ij}^{p}$, $\forall (i, j) \in A$, $\forall p \in P$: $\phi_{ij}^{p}$ equals one if the arc $(i, j)$ belongs to the path $p$, and zero
otherwise.
• $c_{p}^{k} = \sum_{(i,j) \in A^{v}} c_{ij}^{k} \phi_{ij}^{p}$, $\forall p \in P$, $k \in K$: thus $c_{p}^{k}$ indicates the cost component of path $p$ that is
not affected by consolidation.
Hence, the path based formulation F2 is the following:
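\[
\text{minimize} \quad \sum_{k \in K} \sum_{p \in P(k)} c_{p}^{k} z_{p}^{k} + \sum_{(i,j) \in A^{pl}} \sum_{s \in S_{ij}} \left( c_{ij}^{s} l_{ij}^{s} + f_{ij}^{s} y_{ij}^{s} \right) \tag{26}
\]
subject to
\begin{align}
& \sum_{p \in P(k)} z_{p}^{k} = 1 && \forall k \in K, \tag{27} \\
& \sum_{s \in S_{ij}} l_{ij}^{s} \ge \sum_{k \in K} \sum_{p \in P(k)} q_{ij}^{k} \phi_{ij}^{p} z_{p}^{k} && \forall (i,j) \in A^{pl}, \tag{28} \\
& \sum_{k \in K} \sum_{p \in P(k)} w_{ij}^{k} \phi_{ij}^{p} z_{p}^{k} \le W_{ij} && \forall (i,j) \in A^{pl}, \tag{29} \\
& r_{ij}^{s-1} y_{ij}^{s} \le l_{ij}^{s} \le r_{ij}^{s} y_{ij}^{s} && \forall (i,j) \in A^{pl}, \forall s \in S_{ij}, \tag{30} \\
& \sum_{s \in S_{ij}} y_{ij}^{s} \le 1 && \forall (i,j) \in A^{pl}, \tag{31} \\
& l_{ij}^{s} \in \mathbb{R}_{+} && \forall (i,j) \in A^{pl}, \forall s \in S_{ij}, \tag{32} \\
& y_{ij}^{s} \in \{0, 1\} && \forall (i,j) \in A^{pl}, \forall s \in S_{ij}, \tag{33} \\
& z_{p}^{k} \in \{0, 1\} && \forall k \in K, p \in P(k). \tag{34}
\end{align}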
One must choose exactly one path for each shipment, as stated in constraints (27). Constraints
(28) play the same role as constraints (18). Capacities over the $w_{ij}^{k}$ values are modeled by constraints (29). Constraints (30) and (31) are similar to (19) and (20).
4.4 Approximated path based formulation F2L
Here we introduce a simplified formulation where the PL cost functions are replaced by linear
costs $\alpha_{ij} = \min_{s \in S_{ij}} \alpha_{ij}^{s}$. We call this formulation $F_2L$:
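\[
\text{minimize} \quad \sum_{k \in K} \sum_{p \in P(k)} \Big( c_{p}^{k} + \sum_{(i,j) \in A^{pl}} \alpha_{ij} q_{ij}^{k} \phi_{ij}^{p} \Big) z_{p}^{k} \tag{35}
\]
subject to
\begin{align}
& \sum_{p \in P(k)} z_{p}^{k} = 1 && \forall k \in K, \tag{36} \\
& \sum_{k \in K} \sum_{p \in P(k)} q_{ij}^{k} \phi_{ij}^{p} z_{p}^{k} \le r_{ij}^{|S_{ij}|} && \forall (i,j) \in A^{pl}, \tag{37} \\
& \sum_{k \in K} \sum_{p \in P(k)} w_{ij}^{k} \phi_{ij}^{p} z_{p}^{k} \le W_{ij} && \forall (i,j) \in A^{pl}, \tag{38} \\
& z_{p}^{k} \in \mathbb{R}_{+} && \forall k \in K, p \in P(k). \tag{39}
\end{align}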
Constraints (37) define the upper bounds over the arc flows, and constraints (36) and (38)
are similar to (27) and (29), respectively. The merits of F2L are discussed in the next section.
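The objective coefficient of a path variable in (35) can be computed directly from the tariff data. The sketch below is an illustration under stated assumptions: the function names and the data layout (dictionaries keyed by shipment and arc) are hypothetical, and $\alpha_{ij}^{s} = c_{ij}^{s} + f_{ij}^{s}/r_{ij}^{s}$ is taken from the proof of Proposition 2.

```python
def alpha_min(c, f, r):
    """alpha_ij = min over segments s of alpha^s = c^s + f^s / r^s."""
    return min(cs + fs / rs for cs, fs, rs in zip(c, f, r))

def f2l_path_cost(path_arcs, k, arc_cost, pl_tariff, quantity):
    """Objective coefficient of z_p^k in (35).

    path_arcs : list of arcs (i, j) on path p
    arc_cost  : dict {(k, (i, j)): c_ij^k} for arcs in A^v
    pl_tariff : dict {(i, j): (c, f, r)} for arcs in A^pl
    quantity  : dict {(k, (i, j)): q_ij^k}
    """
    total = 0.0
    for arc in path_arcs:
        if arc in pl_tariff:
            # Linearized consolidation cost alpha_ij * q_ij^k.
            total += alpha_min(*pl_tariff[arc]) * quantity[(k, arc)]
        else:
            total += arc_cost[(k, arc)]
    return total

# Hypothetical data: one consolidation arc (a, b) between two dedicated arcs.
tariff = {("a", "b"): ([5, 4, 3], [10, 16, 20], [10, 25, 45])}
costs = {("k1", ("o", "a")): 12.0, ("k1", ("b", "d")): 7.0}
qty = {("k1", ("a", "b")): 9}
path = [("o", "a"), ("a", "b"), ("b", "d")]
print(f2l_path_cost(path, "k1", costs, tariff, qty))
```

Because the coefficient depends only on the path and the shipment, no $l$ or $y$ variables are needed, which is what makes F2L compact.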
4.5 Lower bounds strength
Let $LB_1$ denote the lower bound obtained by relaxing the integrality constraints (24) and (25)
in $F_1$, while $LB_2$ is the lower bound obtained when relaxing (33) and (34) in $F_2$. We indicate by $V_{2L}$
the optimal solution value of $F_2L$. We now present the following two propositions.
Proposition 3 $V_{2L} \le LB_2$.
Proof. Note that a feasible solution $z$ for $F_2L$ is feasible for the $F_2$ linear relaxation, and vice
versa. The two objective functions (26) and (35) differ in the PL related costs. In (26) the PL
costs derive from the MCM, whereas in (35) they are expressed as $\alpha_{ij} q_{ij}$ values. As noted in
Proposition 1, $\alpha_{ij} q_{ij}$ is a lower bound of the corresponding relaxed MCM, thus proving the
inequality.
Proposition 4 If $\alpha_{ij} = \min_{s \in S_{ij}} \alpha_{ij}^{s} = \alpha_{ij}^{|S_{ij}|}$, $\forall (i, j) \in A^{pl}$, then $V_{2L} = LB_2 \ge LB_1$.
Proof. The equality $V_{2L} = LB_2$ is a consequence of Proposition 3 and Observation 1. The
inequality $LB_2 \ge LB_1$ derives from the decomposition choice. A path $p$ satisfies constraints
(12) to (17), and (25). This set of constraints does not exhibit the integrality property. In fact,
the (12) to (17) polytope has a polynomial number of constraints while describing the feasible
region of an NP-hard problem, the SPPTW. Therefore (Geoffrion, 1974), we can expect that
solving the LP relaxation of formulation $F_2$ will yield tighter lower bounds than solving that
of formulation $F_1$.
Since the formulation $F_2L$ is considerably more compact than $F_2$, the previous result highlights the convenience of using the approximate formulation for bounding purposes. When
Assumption 1 holds, the strength of the lower bound is not compromised by solving the compact formulation. Furthermore, $F_2L$ could also prove useful when the above mentioned
assumption does not apply. This would be the case when very large instances could not be
tackled otherwise.
4.6 Valid inequalities
We adapt to formulations F1 and F2 the valid inequalities proposed by Croxton et al. (2007).
We start with the valid inequalities for formulation F1. The strong forcing constraints state that
when on a given arc no segment is chosen, the flow of each shipment is zero on that arc. With
our notation:
\[
x_{ij}^{k} \le \sum_{s \in S_{ij}} y_{ij}^{s} \qquad \forall k \in K, \forall (i, j) \in A^{pl}. \tag{40}
\]
The strong forcing constraints (40) aggregate segments while disaggregating shipments. Introducing additional non-negative variables $x_{ij}^{ks}$ allows us to disaggregate both shipments and
segments. These new variables are related to the previous ones by the following equations:
\begin{align}
& x_{ij}^{k} = \sum_{s \in S_{ij}} x_{ij}^{ks} && \forall k \in K, \forall (i, j) \in A^{pl}, \tag{41} \\
& l_{ij}^{s} \ge \sum_{k \in K} q_{ij}^{k} x_{ij}^{ks} && \forall (i, j) \in A^{pl}, \forall s \in S_{ij}. \tag{42}
\end{align}
The extended forcing constraints can be stated as:
\[
x_{ij}^{ks} \le y_{ij}^{s} \qquad \forall (i, j) \in A^{pl}, \forall k \in K, \forall s \in S_{ij}. \tag{43}
\]
We call $F_1S$ the formulation obtained by adding constraints (40) to $F_1$, while $F_1E$ is the formulation incorporating constraints (41) to (43) and the variables $x_{ij}^{ks}$.
The equivalent strengthened $F_2$ formulation, $F_2S$, would require constraints (40) to be reformulated in the $z$ variables:
\[
\sum_{p \in P(k)} \phi_{ij}^{p} z_{p}^{k} - \sum_{s \in S_{ij}} y_{ij}^{s} \le 0 \qquad \forall k \in K, \forall (i, j) \in A^{pl}. \tag{44}
\]
Croxton et al. (2007) proved that the extended forcing constraints describe the convex hull
of the Lagrangean subproblem when relaxing flow conservation constraints in network flow
problems with piecewise linear costs. Frangioni and Gendron (2007) established the equivalence between extended forcing constraints and residual capacity inequalities. However, computational experiments of Croxton et al. (2007) indicate that when there are relatively large initial fixed costs the extended forcing constraints do not significantly improve upon the strong
forcing ones.
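Checking constraints (44) against a fractional solution amounts to comparing, for every shipment and consolidation arc, the shipment flow on the arc with the sum of the segment variables. A minimal sketch of such an enumeration follows; the function name, the tolerance, and the dictionary-based data layout are assumptions made for illustration.

```python
def violated_strong_forcing(z, y, paths, pl_arcs, tol=1e-6):
    """Enumerate the pairs (k, arc) whose constraint (44) is violated.

    z       : dict {(k, p): value of z_p^k}
    y       : dict {(arc, s): value of y_ij^s}
    paths   : dict {(k, p): set of arcs on path p} (phi_ij^p = 1 iff arc in set)
    pl_arcs : iterable of arcs in A^pl
    """
    # Sum of the segment variables on each consolidation arc.
    y_sum = {arc: 0.0 for arc in pl_arcs}
    for (arc, _s), value in y.items():
        if arc in y_sum:
            y_sum[arc] += value
    # Flow of each shipment on each consolidation arc.
    flow = {}
    for (k, p), value in z.items():
        for arc in paths[(k, p)]:
            if arc in y_sum:
                flow[(k, arc)] = flow.get((k, arc), 0.0) + value
    return sorted((k, arc) for (k, arc), v in flow.items()
                  if v - y_sum[arc] > tol)

# Hypothetical fractional solution on one consolidation arc (a, b).
z = {("k1", "p1"): 1.0, ("k2", "p2"): 0.5, ("k2", "p3"): 0.5}
paths = {("k1", "p1"): {("a", "b")},
         ("k2", "p2"): {("a", "b")},
         ("k2", "p3"): {("o2", "d2")}}
y = {(("a", "b"), 1): 0.6}
print(violated_strong_forcing(z, y, paths, [("a", "b")]))
```

In this example only shipment k1 violates (44), since its flow of 1.0 on the arc exceeds the segment sum 0.6.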
5 Column Generation Algorithms
The column generation approach solves a large linear program (LP) without explicitly including all columns, i.e. variables, in the constraint matrix. The full problem is called the master
problem (MP), while the LP with only a subset of the MP columns is called the restricted master
problem (RMP). In most problems, only a very small subset of all columns will belong to an
optimal solution, and all other (non-basic) columns can be ignored. In a minimization problem
all columns with positive reduced cost can in fact be ignored. The column generation algorithm finds an optimal solution to the MP by solving a series of several smaller RMPs. When
the optimal solution of an RMP is found, the algorithm looks for columns of negative reduced
cost not included in the RMP. This is called the pricing problem. If no column can be found
by the pricing routine, then the current optimal solution of the RMP is also optimal for the MP.
Otherwise, one or more columns with negative reduced costs are added to the RMP and the
algorithm iterates.
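The loop just described can be sketched generically. In the following illustration the names `column_generation`, `solve_rmp` and `price` are hypothetical, and the toy master problem contains only "one path per shipment" rows, so that the RMP and its duals can be obtained by inspection without an LP solver. A real implementation would call an LP solver inside `solve_rmp`.

```python
def column_generation(initial_columns, solve_rmp, price, max_iter=1000):
    """Generic column generation loop.

    solve_rmp(columns) -> (objective, duals) solves the restricted master LP.
    price(duals)       -> list of columns with negative reduced cost
                          (empty when none exists).
    """
    columns = list(initial_columns)
    for _ in range(max_iter):
        objective, duals = solve_rmp(columns)
        new_columns = [col for col in price(duals) if col not in columns]
        if not new_columns:
            return objective, columns   # RMP optimum is optimal for the MP
        columns.extend(new_columns)
    raise RuntimeError("iteration limit reached")

# Toy data (hypothetical): cost of every path of every shipment.
ALL_PATHS = {"k1": {"direct": 10.0, "rail": 7.0},
             "k2": {"direct": 8.0, "hub": 9.0}}

def solve_rmp(columns):
    # With only convexity rows, each dual sigma_k equals the cheapest known path.
    sigma = {k: min(cost for (kk, _p, cost) in columns if kk == k)
             for k in ALL_PATHS}
    return sum(sigma.values()), sigma

def price(sigma):
    # Reduced cost of a path is its cost minus sigma_k.
    return [(k, p, cost) for k, paths in ALL_PATHS.items()
            for p, cost in paths.items() if cost - sigma[k] < -1e-9]

start = [(k, "direct", ALL_PATHS[k]["direct"]) for k in ALL_PATHS]
objective, columns = column_generation(start, solve_rmp, price)
print(objective, columns)
```

Starting from the direct paths, the loop adds the rail path of k1 in the first iteration and stops in the second, when no path prices out.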
We present three algorithms to obtain lower bounds for our problem:
• CG solves the linear relaxation of F2 by column generation;
• CGL solves F2 L by column generation;
• CGS solves the linear relaxation of F2 S by row and column generation.
Finally, feasibility is reached by a branch-and-cut algorithm that chooses paths among those
generated by one of these three algorithms.
5.1 F2 linear relaxation, algorithm CG
We devise the following pricing routine to find paths, i.e. columns, with negative reduced cost
for the formulation $F_2$. Let $\sigma_k$, $\forall k \in K$, $\pi_{ij} \ge 0$, $\forall (i, j) \in A^{pl}$, and $\lambda_{ij} \le 0$, $\forall (i, j) \in A^{pl}$, be dual
multipliers associated with constraints (27), (28), and (29) respectively. The reduced cost of the
variable $z_{p}^{k}$, denoted as $\bar{c}_{p}^{k}$, is:
\[
\bar{c}_{p}^{k} = \sum_{(i,j) \in A^{v}} c_{ij}^{k} \phi_{ij}^{p} - \sum_{(i,j) \in A^{pl}} \phi_{ij}^{p} \left( w_{ij}^{k} \lambda_{ij} - q_{ij}^{k} \pi_{ij} \right) - \sigma_k. \tag{45}
\]
We introduce modified arc costs, $\tilde{c}_{ij}^{k}$, over the digraph $G$, as follows:
\[
\tilde{c}_{ij}^{k} =
\begin{cases}
q_{ij}^{k} \pi_{ij} - w_{ij}^{k} \lambda_{ij} & \text{if } (i, j) \in A^{pl}, \\
c_{ij}^{k} & \text{otherwise.}
\end{cases}
\]
Given this modified cost structure, every path p ∈ P (k) will have a cost c̃kp equal to c̄kp + σk .
Therefore, in order to find a column with a negative reduced cost we look for a path p such that
c̃kp < σk . This can be accomplished by solving for each shipment a SPPTW over the digraph G
with the cost structure modified by the current dual multipliers. Observe that the modified arc
costs are non-negative. The pricing algorithm is detailed in Section 5.4.
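The modified arc costs and the reduced cost test of CG can be written in a few lines. This is an illustrative sketch: the function names and the dictionary layout are hypothetical, and the dual values in the example are made up rather than taken from an LP.

```python
def cg_modified_arc_cost(k, arc, pl_arcs, c, q, w, pi, lam):
    """Arc cost seen by the pricing problem of CG for shipment k.

    pi[arc] >= 0 and lam[arc] <= 0 are the duals of (28) and (29).
    """
    if arc in pl_arcs:
        return q[(k, arc)] * pi[arc] - w[(k, arc)] * lam[arc]
    return c[(k, arc)]

def cg_reduced_cost(k, path_arcs, sigma, **data):
    """Reduced cost (45) of z_p^k: modified path cost minus sigma_k."""
    return sum(cg_modified_arc_cost(k, arc, **data) for arc in path_arcs) - sigma[k]

# Hypothetical data with one consolidation arc (a, b).
data = dict(
    pl_arcs={("a", "b")},
    c={("k1", ("o", "a")): 4.0, ("k1", ("b", "d")): 3.0},
    q={("k1", ("a", "b")): 5.0},
    w={("k1", ("a", "b")): 2.0},
    pi={("a", "b"): 1.5},
    lam={("a", "b"): -0.5},
)
path = [("o", "a"), ("a", "b"), ("b", "d")]
print(cg_reduced_cost("k1", path, {"k1": 20.0}, **data))
```

A negative value means the path is a candidate column. Since $\pi_{ij} \ge 0$ and $\lambda_{ij} \le 0$, the modified cost of a consolidation arc cannot be negative when $q_{ij}^{k}$ and $w_{ij}^{k}$ are non-negative.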
At the first iteration of the algorithm we insert in the $F_2$ formulation the paths $(o_k, d_k)$, i.e.
the direct origin-destination arcs. This brings noteworthy advantages: we avoid using artificial variables with “big M” costs to initialize the process, and the corresponding first LP is
immediately solved, thus providing “good” dual values. Therefore, at the first iteration the $\sigma_k$
dual multipliers assume the value $c_{o_k d_k}$, while the other dual multipliers associated with PL related
constraints are equal to zero because these $(o_k, d_k)$ paths do not use consolidation arcs. Consequently, the pricing algorithm looks for paths that cost less than the direct origin-destination
arcs. However, it uses a modified digraph cost structure where the arcs with PL cost functions
are seen as arcs at zero cost. As we will see in the next section, this undesirable behavior is
intrinsically avoided by algorithm CGL.
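5.2 Solving F2L, algorithm CGL
The CGL algorithm uses a modified pricing problem. Let $\eta_{ij} \le 0$, $\forall (i, j) \in A^{pl}$, be dual multipliers associated with constraints (37). These dual variables, together with the $\lambda$ and $\sigma$ introduced
earlier, allow us to write the reduced cost of the variable $z_{p}^{k}$, $\bar{c}_{p}^{k}$, in the formulation $F_2L$, as:
\[
\bar{c}_{p}^{k} = \sum_{(i,j) \in A^{v}} c_{ij}^{k} \phi_{ij}^{p} - \sum_{(i,j) \in A^{pl}} \phi_{ij}^{p} \left( q_{ij}^{k} (\eta_{ij} - \alpha_{ij}) + w_{ij}^{k} \lambda_{ij} \right) - \sigma_k.
\]
Let the dual modified arc costs, $\tilde{c}_{ij}^{k}$, be:
\[
\tilde{c}_{ij}^{k} =
\begin{cases}
-q_{ij}^{k} (\eta_{ij} - \alpha_{ij}) - w_{ij}^{k} \lambda_{ij} & \text{if } (i, j) \in A^{pl}, \\
c_{ij}^{k} & \text{otherwise.}
\end{cases}
\tag{46}
\]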
Similarly to the previously defined pricing scheme in CG, we use an SPPTW algorithm to find
paths $p$ such that $\tilde{c}_{p}^{k} < \sigma_k$, i.e. columns with negative reduced cost. Observe the difference
between the dual modified digraph cost structure of CGL and that of CG. In CGL we
have $\tilde{c}_{ij}^{k} \ge q_{ij}^{k} \alpha_{ij}$, $\forall (i, j) \in A^{pl}$, and the equality is attained when the $\eta$ and $\lambda$ dual multipliers
are equal to zero. As discussed in the previous section, this happens at the first iteration of
the column generation. Here the pricing phase of CGL uses a more realistic cost structure. In
fact, recalling Observation 2, it is clear that when dealing with PL cost functions that satisfy
Assumption 1, the $\pi_{ij}$ dual multipliers in CG converge to $\alpha_{ij}$. Therefore, CGL, where the $\alpha_{ij}$
values play a role from the beginning, has a faster overall convergence. The advantage of $F_2L$
is clear: it implicitly exploits the knowledge about the optimal dual multipliers $\pi_{ij}$. This fact
leads, together with the smaller size of the LPs, to the better performance of CGL compared to
CG, as we will show in our computational experiments.
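The lower bound $\tilde{c}_{ij}^{k} \ge q_{ij}^{k} \alpha_{ij}$ on consolidation arcs can be seen directly from the CGL modified cost. The sketch below uses a hypothetical function name and made-up data. It evaluates the cost at the first iteration, where $\eta$ and $\lambda$ are zero, and with non-zero duals.

```python
def cgl_modified_arc_cost(k, arc, pl_arcs, c, q, w, alpha, eta, lam):
    """Arc cost seen by the pricing problem of CGL for shipment k.

    eta[arc] <= 0 and lam[arc] <= 0 are the duals of (37) and (38).
    """
    if arc in pl_arcs:
        return -q[(k, arc)] * (eta[arc] - alpha[arc]) - w[(k, arc)] * lam[arc]
    return c[(k, arc)]

# Hypothetical data for a single consolidation arc (a, b).
arc = ("a", "b")
common = dict(pl_arcs={arc}, c={}, q={("k1", arc): 5.0},
              w={("k1", arc): 2.0}, alpha={arc: 3.0})

first_iteration = cgl_modified_arc_cost("k1", arc, eta={arc: 0.0},
                                        lam={arc: 0.0}, **common)
later_iteration = cgl_modified_arc_cost("k1", arc, eta={arc: -1.0},
                                        lam={arc: -0.5}, **common)
print(first_iteration, later_iteration)
```

At the first iteration the arc costs $q_{ij}^{k} \alpha_{ij}$, whereas in CG the same arc would cost zero because $\pi$ and $\lambda$ are zero there.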
5.3 F2S linear relaxation, algorithm CGS
Solving the $F_2S$ linear relaxation requires larger LPs than in CG because of constraints (44).
These constraints, when dealing with very large instances, cannot all be introduced because of
excessive memory requirements. Therefore, we resort to a column and row generation algorithm, where the generated rows are the violated constraints (44). The algorithm CGS consists
of two phases. During the first phase, CGS-P1, row generation does not occur. CGS-P1 consists of an improved version of the algorithm CG. The algorithm CGS-P1 differs from CG by
exploiting the knowledge about the optimal dual variables $\pi$, i.e. the $\pi_{ij}$ multipliers are set to
$\alpha_{ij}$. In addition, the algorithm CGS-P1 is useful to assess the improvement due to the dual
information. At the end of CGS-P1 the second phase, CGS-P2, starts: violated inequalities
are searched by complete enumeration and appended, if they exist, to the current model. Afterward, a new column generation procedure, which we now outline, is called. The algorithm iterates until no new violated inequalities are found. The column generation procedure for CGS-P2
differs from that of CG by considering the dual multipliers associated with the generated constraints, i.e. a subset of (44). We indicate these dual multipliers by $\nu_{ij}^{k} \le 0$, $\forall k \in K$, $\forall (i, j) \in A^{pl}$.
Consequently, the reduced cost of the variable $z_{p}^{k}$, $\bar{c}_{p}^{k}$, is:
\[
\bar{c}_{p}^{k} = \sum_{(i,j) \in A^{v}} c_{ij}^{k} \phi_{ij}^{p} - \sum_{(i,j) \in A^{pl}} \phi_{ij}^{p} \left( w_{ij}^{k} \lambda_{ij} - q_{ij}^{k} \pi_{ij} + \nu_{ij}^{k} \right) - \sigma_k. \tag{47}
\]
The modified arc costs, $\tilde{c}_{ij}^{k}$, can be expressed as:
\[
\tilde{c}_{ij}^{k} =
\begin{cases}
q_{ij}^{k} \pi_{ij} - w_{ij}^{k} \lambda_{ij} - \nu_{ij}^{k} & \text{if } (i, j) \in A^{pl}, \\
c_{ij}^{k} & \text{otherwise.}
\end{cases}
\]
5.4 Solving the pricing problem
This section describes how the pricing problems in our column generation algorithms are
solved. As described earlier, because of our representation of timetables as collapsed time
windows, our pricing problem is an SPPTW. We are given a directed graph where each arc has
an associated cost cij and travel time tij . Each node i has a time window [ai , bi ]. We are allowed
to arrive at a node i before ai , but we must wait until time ai before the node can be processed
and the node cannot be visited after bi . The objective of the SPPTW is to construct a minimum
cost path (with respect to the arc costs cij ) from a start node ok to an end node dk such that time
windows are respected. We assume that travel times are non-negative, but in general we do not
make assumptions on the travel costs (they could be negative). Negative travel costs can lead
to cycles in the optimal shortest path but as long as the travel time on all cycles is positive the
optimal shortest path is finite. In the shortest path problem considered in this paper we know
that all arc costs are non-negative. This means that the shortest paths returned from an SPPTW
algorithm are without loops. Consequently nothing is gained by considering the more difficult
elementary shortest path algorithms.
Labeling algorithms for solving the SPPTW are described by Desrochers and Soumis (1988),
Feillet et al. (2004), and Irnich and Desaulniers (2005). Such algorithms build partial paths starting from the start node $o_k$. Each partial path $(o_k, i_1, \dots, i_n)$ is represented by a label $[i_n, c, t, h]$
where $i_n$ is the end node of the partial path, $c$ is the cost of the partial path, $t$ is the arrival
time at node $i_n$, and $h$ is a pointer to the parent label (the label corresponding to the path
$(o_k, i_1, \dots, i_{n-1})$). The algorithm maintains two sets of labels: processed and unprocessed labels.
The algorithm starts with a single label $[o_k, 0, 0, \text{NULL}]$ in the set of unprocessed labels. At each
iteration the algorithm considers a label $[i, c, t, h]$ from the set of unprocessed labels, and this
label is extended by considering all arcs originating in node $i$. Extending the label to node $j$
results in the label $[j, c + c_{ij}, \max\{t + t_{ij}, a_j\}, h']$ where $h'$ is a pointer to the original label. If
$\max\{t + t_{ij}, a_j\} > b_j$ then the label is discarded as it corresponds to a partial path that violates the time window. The newly generated labels are moved to the set of unprocessed labels
while the label we extended is moved to the set of processed labels. The algorithm terminates
when the set of unprocessed labels is empty. In this case the label corresponding to the shortest
path from $o_k$ to $d_k$ is the label associated with node $d_k$ that has the lowest cost. The algorithm
described so far enumerates all feasible paths. To speed up the algorithm dominance rules are
introduced. If we have two labels $[i, c, t, h]$ and $[i', c', t', h']$ such that $i = i'$, $c \le c'$ and $t \le t'$ then
the second label is dominated and can be discarded. In fact, all possible ways of completing
the partial path corresponding to the second label can be used to complete the partial path corresponding to the first label. As a consequence, the partial path corresponding to the first label
leads to completions with cost at least as good as the completions of the partial path associated
with the second label.
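The unidirectional labeling scheme with dominance described above can be sketched as follows. This is an illustration under assumptions, not the bidirectional algorithm used in the paper: the function name and data layout are hypothetical, arc costs and travel times are assumed non-negative, and the initial label starts at the opening of the source time window rather than at time zero.

```python
from collections import deque

def spptw(arcs, windows, source, target, start_time=0.0):
    """Labeling sketch for the SPPTW with non-negative arc costs and times.

    arcs    : dict {i: [(j, cost_ij, time_ij), ...]}
    windows : dict {i: (a_i, b_i)}
    Returns (cost, path) of a cheapest time-feasible path, or None.
    """
    a0, b0 = windows[source]
    t0 = max(start_time, a0)
    if t0 > b0:
        return None
    root = (source, 0.0, t0, None)          # label = [node, cost, time, parent]
    labels = {source: [root]}               # non-dominated labels per node
    unprocessed = deque([root])
    while unprocessed:
        label = unprocessed.popleft()
        node, cost, time, _parent = label
        if not any(l is label for l in labels[node]):
            continue                        # dominated after being queued
        for j, c_ij, t_ij in arcs.get(node, []):
            a_j, b_j = windows[j]
            new_time = max(time + t_ij, a_j)    # wait when arriving early
            if new_time > b_j:
                continue                        # time window violated
            new_cost = cost + c_ij
            kept = labels.setdefault(j, [])
            if any(l[1] <= new_cost and l[2] <= new_time for l in kept):
                continue                        # new label is dominated
            # Drop the labels dominated by the new one.
            kept[:] = [l for l in kept
                       if not (new_cost <= l[1] and new_time <= l[2])]
            new_label = (j, new_cost, new_time, label)
            kept.append(new_label)
            unprocessed.append(new_label)
    if not labels.get(target):
        return None
    best = min(labels[target], key=lambda l: l[1])
    path, l = [], best
    while l is not None:                    # follow the parent pointers
        path.append(l[0])
        l = l[3]
    return best[1], path[::-1]

# Hypothetical network: the cheap route via a arrives too late at d.
arcs = {"o": [("a", 1.0, 2.0), ("b", 5.0, 1.0)],
        "a": [("d", 1.0, 2.0)],
        "b": [("d", 1.0, 1.0)]}
windows = {"o": (0, 0), "a": (0, 10), "b": (0, 10), "d": (0, 3)}
print(spptw(arcs, windows, "o", "d"))
```

Keeping only non-dominated labels per node is what prevents the enumeration of all feasible paths. With non-negative costs and times, any label that revisits a node is dominated by a label already stored there, so the loop terminates.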
In this paper we use the bidirectional algorithm proposed by Righini and Salani (2006).
This algorithm improves the algorithm outlined above by extending paths from both the start
node ok and the end node dk simultaneously. Informally speaking, the algorithm stops when
the paths from the start node meet with paths from the end node. The algorithm is described
in detail by Righini and Salani (2006). It provides significant speedups compared with the
unidirectional algorithm that only extends paths from the start node (see results in Righini and
Salani (2006)).
In each iteration of the column generation process we are interested in generating variables
(columns, paths) with negative reduced costs. It is not necessary to generate the variable that
has the most negative reduced cost. Consequently we are free to use heuristics for generating
the variables. We only need to solve the pricing problem to optimality to prove that no variable
with negative reduced cost exists, that is, the exact algorithm needs only be called when the
heuristic algorithm no longer finds variables with negative reduced costs. It is well known
(e.g., Irnich and Desaulniers (2005)) that the labeling algorithm for the shortest path problem
can easily be turned into a heuristic. However, the pricing problems that we encounter are all
relatively easy and therefore, for simplicity, we always use the exact algorithm.
5.5 Heuristics to obtain upper bounds, algorithms H-CG, H-CGL, H-CGS
The column generation algorithms outlined above will usually yield a fractional solution, meaning that some shipments use more than one path. We propose a very simple heuristic scheme
aimed at obtaining a feasible solution. It consists of performing a time limited branch-and-cut
search on the set of generated columns. This amounts to solving the final RMP as an integer
program. By doing so, we turn the column generation algorithms into heuristics. We use a
state-of-the-art MILP solver for this branch-and-cut algorithm. When reporting computational
results we will indicate these heuristics as H-CG, H-CGL, H-CGS. Note that the branch-and-cut
for H-CGL is initiated by loading the generated columns into the F2 formulation, since F2 L is
not equivalent to F2 when converted to an integer program.
In general the outlined heuristic approach can fail because there is no guarantee of integer feasibility in the RMP. However, in our problem it is always possible to get a feasible integer solution because of the direct origin-destination arc for each shipment. As described previously,
these paths are inserted at the beginning of the column generation, and hence they belong to
the set of paths on which the branch-and-cut algorithm is applied. Therefore, their usefulness is
twofold: they ensure faster convergence when computing a lower bound (by avoiding the “big
M” to initialize the process), and they guarantee integer feasibility when looking heuristically
for an upper bound.
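The idea of restricting the integer search to the generated columns can be illustrated on a tiny instance. The sketch below replaces the MILP branch-and-cut of the paper with plain exhaustive enumeration, which is only viable for a handful of shipments. All names and data are hypothetical, quantities are assumed arc-independent, and the capacity constraints (29) are ignored.

```python
from itertools import product

def pl_cost(load, c, f, r):
    """Cheapest c^s * load + f^s over segments with r^(s-1) <= load <= r^s."""
    if load == 0:
        return 0.0
    lower = [0] + list(r[:-1])               # r^0 = 0 is assumed
    costs = [c[s] * load + f[s]
             for s in range(len(r)) if lower[s] <= load <= r[s]]
    return min(costs) if costs else float("inf")   # inf: load above r^|S|

def best_integer_choice(columns, tariff, quantity):
    """Pick one generated path per shipment by exhaustive enumeration.

    columns  : dict {k: [(c_p^k, set of A^pl arcs on p), ...]}
    tariff   : dict {arc: (c, f, r)}
    quantity : dict {k: q^k}
    """
    shipments = sorted(columns)
    best_value, best_choice = float("inf"), None
    for choice in product(*(range(len(columns[k])) for k in shipments)):
        value, load = 0.0, {}
        for k, idx in zip(shipments, choice):
            path_cost, arcs = columns[k][idx]
            value += path_cost
            for arc in arcs:
                load[arc] = load.get(arc, 0) + quantity[k]
        # True PL cost of the consolidated loads.
        value += sum(pl_cost(q, *tariff[arc]) for arc, q in load.items())
        if value < best_value:
            best_value, best_choice = value, dict(zip(shipments, choice))
    return best_value, best_choice

# Two shipments, each with a direct path and a path using arc (a, b).
arc = ("a", "b")
columns = {"k1": [(60.0, set()), (10.0, {arc})],
           "k2": [(60.0, set()), (10.0, {arc})]}
tariff = {arc: ([5, 4, 3], [10, 16, 20], [10, 25, 45])}
print(best_integer_choice(columns, tariff, {"k1": 8, "k2": 8}))
```

In this example consolidating both shipments on the shared arc is cheaper than any choice involving a direct path. Because the direct paths are always among the columns, at least one finite-cost choice exists.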
6 Computational Results
We now present computational experiments. We first describe how test instances were generated, and we then provide results obtained with the various algorithms.
6.1 Generation of test instances
As mentioned in Section 1, this study was motivated by a real-life application. The largest customer requires shipments from 10 factories located in Northern Italy to 10 regional distribution
warehouses in Central-South Italy. The planning horizon is two weeks and 28 block trains can
be activated. There are five train stations where loading can take place and four arrival stations
where these block trains end. Once arrived at one of these four stations the shipment can be
sent to its final destination by truck. Two of the four arrival stations offer the opportunity to
re-route the shipment by rail to seven other train stations. This additional set of railway links
has a cost structure similar to a dedicated service, i.e. there are no quantity discounts and the
cost is per railcar. The available mode and consolidation combinations yield six alternatives:
1. dedicated origin-destination service by truck;
2. consolidated service by truck to an intermediate platform; final link by dedicated truck
service;
3. consolidated service by truck to a loading train station; consolidation by block train to
arrival station; final link by dedicated truck service;
4. as option three, but with an additional step before the last link: a dedicated rail service
between two compatible train stations;
5. as option three, but with dedicated truck service between factory and loading train station;
6. dedicated truck service between factory and loading train station; consolidation by block
train to intermediate station; dedicated rail service to arrival station; final link by dedicated truck service.
With 122 shipments in two weeks, pickup and delivery time windows, virtual nodes and arcs
to represent mode or vehicle transfers, etc. the resulting digraph has a rather large size: |N | =
1507, |A| = 4900, and |Apl | = 2094. Instances of this size are clearly beyond the capabilities
of exact approaches. To test the efficacy of our algorithms we created smaller instances that
are subsets of the real-life one with |K| equal to 10, 30, and 60. The instance set contains three
instances for each choice of |K|, labeled as i-10-01, i-10-02, i-10-03, etc. and the realistic sized
instance denoted as i-122-01.
6.2 Implementation details and results
The algorithms were implemented in C++ using CPLEX 10.0. Computational experiments were
run on a 2.5 GHz Pentium IV Linux server with 4 GB of memory. When reporting the experiments with the three formulations F1 , F1 S, and F1 E implemented in CPLEX we will denote
their results by the name of the formulation, i.e. F1 will indicate the formulation as well as its
CPLEX implementation. We have not tweaked the CPLEX parameters. We have only set time
limits: 10 hours for the F1 , F1 S, and F1 E algorithms, 15 minutes for H-CG and H-CGL, and
30 minutes for H-CGS.
We first compare in Table 1 results by F1 and H-CGL. H-CGL clearly outperforms F1 by
taking considerably less computational time while obtaining better upper bounds than the 10
hours of the truncated CPLEX branch-and-cut upon F1 . Note that whenever F1 requires less
than the time limit, it means that an optimal solution has been found. On these instances,
solved to optimality, H-CGL obtains the same solution value, but in a shorter time. Results
on the real-life i-122-01 instance are worth a remark: H-CGL provides a solution that is 10 percentage points
better while using three orders of magnitude less computational time than F1 .
Table 2 clarifies why we have chosen to benchmark H-CGL with F1 . This formulation is the
only one able to handle the large instance with |K| = 122. In fact, because of memory limits
F1 S manages to load instances up to |K| = 60 while the heaviest F1 E stops at |K| = 30. While
examining the lower bounds obtained at the root nodes (results that we do not report here),
F1 S improves over F1 but with a slightly longer computational time. As Table 2 shows, the
better bounding capabilities of the strong forcing constraints do not uniformly guarantee better overall performance, due to slower node examination. In spite of their theoretical strength,
the extended forcing constraints in F1 E are disappointing: they yield a lower bound improvement similar to that of the strong forcing constraints but with computational time larger by an
order of magnitude. Consequently, F1 E results are dominated by those of F1 and F1 S. Customized branch-and-cut algorithms based on the discussed valid inequalities would have been
more competitive than our straightforward implementation. However, results on the small instances, |K| = 30, indicate that the lower bound improvement is not enough to compensate for the
additional computational burden. Furthermore, this improvement is already captured by the
more compact strong forcing constraints. In fact, Croxton et al. (2007) report that when there
are relatively large initial fixed costs the extended formulation does not significantly improve
upon the strong one. This is so in our case and it explains why we implemented a column
and row generation algorithm based upon the strong forcing constraints only.
We now discuss the advantages of using the F2 L formulation to compute lower bounds.
Table 3 reports the computational times of three algorithms: CG, CGS-P1, and CGL. These three
algorithms compute the same lower bound. As expected, CG is the slowest. The algorithm
CGS-P1 improves on average with respect to CG because it exploits the knowledge about the
optimal π dual values. The algorithm CGL considerably improves with respect to CGS-P1
because of the compactness of the corresponding formulation. These two combined effects
lead to CGL being 40 times faster on average than CG.
The performance of the three column generation based heuristics is summarized in Table
4. The merits of the more compact F2 L formulation can be further appreciated. The algorithm
H-CGL is better not only because CGL has a faster column generation convergence. In addition,
the heuristic phase, i.e. the branch-and-cut search upon the set of generated columns, is more
effective in H-CGL. This happens because H-CG receives a larger set of columns from CG than
H-CGL does from CGL. However, these additional paths are not useful, because, as observed
in Section 5.1, they are generated with a zero cost estimate of the PL functions at the beginning
of the CG algorithm. The H-CGS algorithm corrects this unfavorable characteristic. However,
it does not obtain appreciably better results, even within an extended time limit of 30 minutes.
Table 5 assesses the merit of the pricing algorithm. We indicate by H-CGL-LSA1 a modification of the H-CGL algorithm where the label setting routine returns only one path, if any. The
computational results indicate that this modification considerably degrades solution quality although it yields a faster algorithm. The algorithm H-CGL-SPCPLEX replaces the label setting
routine by a MILP model solved by CPLEX. Here both solution quality and computational time
worsen. Moreover, the memory limit is reached on the medium and large instances.
The algorithm CGS proves its usefulness when computing lower bounds. In Table 6 we
report lower bounds provided by F1 , F1 S and F1 E that are obtained at the termination of the
branch-and-cut algorithms. The computational times to compute these lower bounds are the
ones of Table 2. The last two columns of Table 6 provide the results of the CGS algorithm.
In spite of the relatively small computational times, the lower bounds provided by CGS are
of high quality, since the algorithm obtains the best lower bound in three cases. Finally, using
these lower bounds, the solution quality of the heuristic can be assessed. In the worst case the
solution values produced by using H-CGL lie within a few percentage points from optimality.
7 Conclusions
We have described, formulated and solved a new and feature-rich routing problem. The construction of an appropriate network representation turned out to be non-trivial. We have devised a solution approach that exploits specific problem characteristics like the cost function
properties and the realistic upper bounds upon feasible paths. Computational experiments
showed the efficacy of the proposed heuristic algorithm based on decomposition.
8 Acknowledgements
This work was partly supported by the MUR (Italy) under project PILOT, and by the Natural Sciences and Engineering Research Council of Canada under grants 227837-00 and 39682-05. This
support is gratefully acknowledged. Thanks are also due to Manlio Gaudioso, M. Flavia
Monaco, and Gabriella Messina for fruitful discussions.
References
Aldaihani, M. and Dessouky, M. M. (2003). Hybrid scheduling methods for paratransit operations. Computers & Industrial Engineering, 45(1):75–96.
Amiri, A. and Pirkul, H. (1997). New formulation and relaxation to solve a concave-cost network flow problem. Journal of the Operational Research Society, 48:278–287.
Barnhart, C., Hane, C. A., and Vance, P. H. (2000). Using branch-and-price-and-cut to solve
origin-destination integer multicommodity flow problems. Operations Research, 48(2):318–
326.
Barnhart, C. and Ratliff, H. D. (1993). Modeling intermodal routing. Journal of Business Logistics,
14(1):205 – 223.
Boardman, B. S., Malstrom, E. M., Butler, D. P., and Cole, M. H. (1997). Computer assisted
routing of intermodal shipments. Computers & Industrial Engineering, 33(1-2):311–314.
Bontekoning, Y. M., Macharis, C., and Trip, J. J. (2004). Is a new applied transportation research
field emerging?–A review of intermodal rail-truck freight transport literature. Transportation
Research Part A: Policy and Practice, 38(1):1–34.
Caramia, M., Dell’Olmo, P., Gentili, M., and Mirchandani, P. B. (2007). Delivery itineraries and
distribution capacity of a freight network with time slots. Computers & Operations Research,
34(6):1585–1600.
Chang, T.-S. (2007). Best routes selection in international intermodal networks. Computers &
Operations Research, In Press, Corrected Proof.
Crainic, T. G. (2000). Service network design in freight transportation. European Journal of
Operational Research, 122(2):272–288.
Crainic, T. G., Florian, M., Guélat, J., and Spiess, H. (1990). Strategic planning of freight transportation: STAN, an interactive-graphic system. Transportation Research Record, 1283:97–124.
Crainic, T. G., Gendron, B., and Hernu, G. (2004). A slope scaling/Lagrangean perturbation
heuristic with long-term memory for multicommodity capacitated fixed-charge network design. Journal of Heuristics, 10(5):525–545.
Crainic, T. G. and Kim, K. H. (2006). Intermodal transportation. In Barnhart, C. and Laporte, G.,
editors, Transportation, volume 14 of Handbooks in Operations Research and Management Science,
chapter 8, pages 467–537. Elsevier.
Crainic, T. G. and Laporte, G. (1997). Planning models for freight transportation. European
Journal of Operational Research, 97(3):409–438.
Croxton, K. L., Gendron, B., and Magnanti, T. L. (2003a). A comparison of mixed-integer programming models for nonconvex piecewise linear cost minimization problems. Management
Science, 49(9):1268–1273.
Croxton, K. L., Gendron, B., and Magnanti, T. L. (2003b). Models and methods for merge-in-transit operations. Transportation Science, 37(1):1–21.
Croxton, K. L., Gendron, B., and Magnanti, T. L. (2007). Variable disaggregation in network
flow problems with piecewise linear costs. Operations Research, 55(1):146–157.
Desaulniers, G., Desrosiers, J., and Solomon, M. M., editors (2005). Column Generation. Springer,
Boston.
Desrochers, M. and Soumis, F. (1988). A generalized permanent labeling algorithm for the
shortest path problem with time windows. INFOR, 26:191–212.
Feillet, D., Dejax, P., Gendreau, M., and Gueguen, C. (2004). An exact algorithm for the elementary shortest path problem with resource constraints: Application to some vehicle routing
problems. Networks, 44:216–229.
Frangioni, A. and Gendron, B. (2007). 0-1 reformulations of the multicommodity capacitated
network design problem. Technical Report 29, CIRRELT — Centre interuniversitaire de
recherche sur les réseaux d’entreprise, la logistique et le transport.
Gendron, B., Crainic, T., and Frangioni, A. (1998). Multicommodity capacitated network design. In Sansò, B. and Soriano, P., editors, Telecommunications Network Planning, pages 1–19.
Kluwer, Boston.
Geoffrion, A. M. (1974). Lagrangean relaxation for integer programming. Mathematical Programming Study, 2:82–113.
Guélat, J., Florian, M., and Crainic, T. G. (1990). A multimode multiproduct network assignment model for strategic planning of freight flows. Transportation Science, 24(1):25–39.
Ibaraki, T., Imahori, S., Kubo, M., Masuda, T., Uno, T., and Yagiura, M. (2005). Effective local search algorithms for routing and scheduling problems with general time-window constraints. Transportation Science, 39(2):206–232.
Irnich, S. and Desaulniers, G. (2005). Shortest path problems with resource constraints. In
Desaulniers, G., Desrosiers, J., and Solomon, M. M., editors, Column Generation, chapter 2,
pages 33–65. Springer, Boston.
Jourquin, B., Beuthe, M., and Demilie, C. L. (1999). Freight bundling network models: Methodology and application. Transportation Planning and Technology, 23:157–177.
Keha, A. B., de Farias, I. R., and Nemhauser, G. L. (2004). Models for representing piecewise
linear cost functions. Operations Research Letters, 32(1):44–48.
Keha, A. B., de Farias, I. R., and Nemhauser, G. L. (2006). A branch-and-cut algorithm without
binary variables for nonconvex piecewise linear optimization. Operations Research, 54(5):847–858.
Kim, D. and Pardalos, P. M. (1999). A solution approach to the fixed charge network flow
problem using a dynamic slope scaling procedure. Operations Research Letters, 24(4):195–203.
Kim, D. and Pardalos, P. M. (2000). Dynamic slope scaling and trust interval techniques for
solving concave piecewise linear network flow problems. Networks, 35(3):216–222.
Macharis, C. and Bontekoning, Y. (2004). Opportunities for OR in intermodal freight transport
research: A review. European Journal of Operational Research, 153(2):400–416.
Min, H. (1991). International intermodal choices via chance-constrained goal programming.
Transportation Research Part A: Policy and Practice, 25(6):351–362.
Righini, G. and Salani, M. (2006). Symmetry helps: bounded bi-directional dynamic programming for the elementary shortest path problem with resource constraints. Discrete Optimization, 3(3):255–273.
Rockafellar, R. T. (1970). Convex Analysis. Princeton University Press, Princeton.
Southworth, F. and Peterson, B. E. (2000). Intermodal and international freight network modeling. Transportation Research Part C: Emerging Technologies, 8(1-6):147–166.
Toth, P. and Vigo, D. (2002). The Vehicle Routing Problem. SIAM Monographs on Discrete Mathematics and Applications, Philadelphia.
Xu, H., Chen, Z.-L., Rajagopal, S., and Arunapuram, S. (2003). Solving a practical pickup and
delivery problem. Transportation Science, 37(3):347–364.
Ziliaskopoulos, A. and Wardell, W. (2000). An intermodal optimum path algorithm for multimodal networks with dynamic arc travel times and switching delays. European Journal of
Operational Research, 125(3):486–502.
            H-CGL                     F1
Instance    Sol. Value  Time (min.)   Sol. Value  Time (min.)
i-10-01     100.0       0.2           100.0       600.0
i-10-02     100.0       0.1           100.0       2.5
i-10-03     100.0       0.4           100.0       2.6
i-30-01     100.0       0.1           136.3       600.0
i-30-02     100.0       0.2           100.9       600.0
i-30-03     100.0       15.0          100.0       246.7
i-60-01     100.0       0.5           113.6       600.1
i-60-02     100.0       2.5           113.6       600.1
i-60-03     100.0       15.0          100.3       600.3
i-122-01    100.0       0.7           110.0       600.4
Average     100.0       3.5           107.5       445.3
Table 1: Computational results with F1 and H-CGL. The solution values are scaled to 100,
and bold entries correspond to the best entry for each row, for computational time as well as
solution quality.
            F1                        F1 S                      F1 E
Instance    Sol. Value  Time (min.)   Sol. Value  Time (min.)   Sol. Value  Time (min.)
i-10-01     100.0       600           100.0       18            100.0       600
i-10-02     100.0       3             100.0       1             100.0       43
i-10-03     100.0       3             100.0       4             100.0       30
i-30-01     100.0       600           100.0       600           100.0       600
i-30-02     100.0       600           105.0       600           152.3       600
i-30-03     100.0       247           100.0       600           100.0       600
i-60-01     100.0       600           100.0       600           n.a.        n.a.
i-60-02     100.0       600           105.6       600           n.a.        n.a.
i-60-03     100.5       600           100.0       600           n.a.        n.a.
Average     100.1       428           101.2       403
Table 2: Computational results with F1, F1 S and F1 E. The solution values are scaled to 100.
Instance    CG (sec.)  CGS-P1 (sec.)  CGL (sec.)  CG/CGS-P1  CGS-P1/CGL  CG/CGL
i-10-01     7.8        4.0            0.1         2.0        28.3        55.5
i-10-02     10.9       7.0            0.2         1.5        36.9        57.2
i-10-03     57.4       48.7           0.7         1.2        70.6        83.1
i-30-01     26.7       23.9           1.1         1.1        22.5        25.2
i-30-02     39.2       34.6           1.6         1.1        21.6        24.5
i-30-03     141.7      223.5          12.2        0.6        18.3        11.6
i-60-01     116.7      85.2           3.2         1.4        26.6        36.4
i-60-02     167.2      128.5          5.2         1.3        24.7        32.1
i-60-03     541.2      843.5          29.8        0.6        28.3        18.1
i-122-01    667.7      548.2          12.0        1.2        45.7        55.6
Average                                           1.2        32.3        39.9
Table 3: Computational times of three column generation algorithms: CG, CGS-P1, CGL.
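The ratio columns of Table 3 are quotients of the corresponding time columns. As a minimal sketch (the helper name `ratio` is ours, not the paper's), the following recomputes the three ratios for instance i-122-01 from the tabulated times. For instances with very small CGL times, such as i-10-01, recomputing from the rounded entries does not reproduce the tabulated ratios, presumably because those were computed from unrounded times.

```python
# Computational times in seconds for instance i-122-01, copied from Table 3.
times = {"CG": 667.7, "CGS-P1": 548.2, "CGL": 12.0}


def ratio(numerator: str, denominator: str, t: dict) -> float:
    """Return the time of one algorithm divided by the time of another."""
    return t[numerator] / t[denominator]


ratios = {
    "CG/CGS-P1": ratio("CG", "CGS-P1", times),
    "CGS-P1/CGL": ratio("CGS-P1", "CGL", times),
    "CG/CGL": ratio("CG", "CGL", times),
}

for name, value in ratios.items():
    # Prints 1.2, 45.7 and 55.6, matching the i-122-01 row of Table 3.
    print(f"{name}: {value:.1f}")
```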
            H-CGL                     H-CG                      H-CGS
Instance    Sol. Value  Time (min.)   Sol. Value  Time (min.)   Sol. Value  Time (min.)
i-10-01     100.0       0.2           100.0       15.0          100.0       28.2
i-10-02     100.0       0.1           100.0       3.3           100.0       3.8
i-10-03     100.0       0.4           100.0       15.0          100.0       30.0
i-30-01     100.0       0.1           100.4       15.0          100.0       7.9
i-30-02     100.0       0.2           100.0       15.0          100.0       4.0
i-30-03     100.0       15.0          100.0       15.0          100.0       30.0
i-60-01     100.0       0.5           101.1       15.0          100.4       30.0
i-60-02     100.1       2.5           100.4       15.0          100.0       30.0
i-60-03     100.0       15.0          102.1       15.0          106.2       30.0
i-122-01    100.0       0.7           104.0       15.0          102.8       30.0
Average     100.0       3.5           100.8       13.8          100.9       22.4
Table 4: Computational results with the three heuristic algorithms. The solution values are
scaled to 100, and bold entries correspond to the best entry for each row, for computational
time as well as solution quality.
            H-CGL                     H-CGL-LSA1                H-CGL-SPCPLEX
Instance    Sol. Value  Time (min.)   Sol. Value  Time (min.)   Sol. Value  Time (min.)
i-10-01     100.0       0.2           104.8       0.01          101.0       0.6
i-10-02     100.0       0.1           101.9       0.01          101.9       0.8
i-10-03     100.0       0.4           104.2       0.02          101.1       4.5
i-30-01     100.0       0.1           116.6       0.02          112.0       10.7
i-30-02     100.0       0.2           112.7       0.02          n.a.        10.0
i-30-03     100.0       15.0          116.8       0.05          n.a.        10.0
i-60-01     100.0       0.5           104.7       0.07          n.a.        10.0
i-60-02     100.0       2.5           103.3       0.08          n.a.        10.0
i-60-03     100.0       15.0          103.7       0.43          n.a.        10.0
i-122-01    100.0       0.7           103.4       0.29          n.a.        10.0
Average     100.0       3.5           107.2       0.10
Table 5: Assessment of the pricing algorithm. The solution values are scaled to 100, and bold
entries correspond to the best entry for each row, for computational time as well as solution
quality.
Instance    H-CGL/best LB  F1     F1 S   F1 E   CGS    CGS Time (min.)
i-10-01     100.1          95.6   100.0  96.1   84.7   0.2
i-10-02     100.0          100.0  100.0  100.0  82.3   0.1
i-10-03     100.0          100.0  100.0  100.0  75.5   1.7
i-30-01     109.7          93.3   94.0   92.7   100.0  0.6
i-30-02     104.3          100.0  98.9   95.7   91.4   1.0
i-30-03     100.1          99.5   100.0  100.0  83.0   7.5
i-60-01     108.0          93.6   94.8   n.a.   100.0  2.2
i-60-02     106.1          97.3   100.0  n.a.   97.9   3.4
i-60-03     100.9          100.0  100.0  n.a.   86.6   26.9
i-122-01    105.9          87.9   n.a.   n.a.   100.0  13.7
Average     103.5          96.7   98.6   97.4   90.1   5.7
Table 6: H-CGL solution quality compared with the best known lower bounds. The lower
bound values are scaled to 100, and bold entries correspond to the best entry for each row.
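Since the best lower bound is scaled to 100, an "H-CGL/best LB" entry of 109.7 can be read as an H-CGL solution that is at most 9.7% above the optimum. The following minimal sketch, under that reading, converts the column into percentage gaps and recovers the average; the variable names are ours.

```python
from statistics import mean

# "H-CGL/best LB" column of Table 6 (best lower bound scaled to 100).
hcgl_over_best_lb = {
    "i-10-01": 100.1, "i-10-02": 100.0, "i-10-03": 100.0,
    "i-30-01": 109.7, "i-30-02": 104.3, "i-30-03": 100.1,
    "i-60-01": 108.0, "i-60-02": 106.1, "i-60-03": 100.9,
    "i-122-01": 105.9,
}

# Percentage gap between the heuristic solution and the best lower bound.
gaps = {name: value - 100.0 for name, value in hcgl_over_best_lb.items()}

average_gap = mean(gaps.values())
worst_instance = max(gaps, key=gaps.get)

print(f"average gap: {average_gap:.1f}%")    # 3.5%, matching the Average row
print(f"largest gap: {worst_instance} ({gaps[worst_instance]:.1f}%)")
```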