Load Balancing

Published on May 2016 | Categories: Documents | Downloads: 40 | Comments: 0 | Views: 330
of 16
Download PDF   Embed   Report

load balancing

Comments

Content

&
  

(11) EP 2 006 770 A1

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 24.12.2008 Bulletin 2008/52

(21) Application number: 07290793.4

(22) Date of filing: 19.06.2007

(51) Int Cl.: G06F 9/50 (2006.01)

(84) Designated Contracting States: AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR
Designated Extension States: AL BA HR MK RS

(71) Applicant: VirtualLogix SA, 78180 Montigny-le-Bretonneux (FR)

(72) Inventor: Grouzdev, Vladimir, 75016 Paris (FR)

(74) Representative: Rummler, Felix et al, R.G.C. Jenkins & Co, 26 Caxton Street, London SW1H 0RJ (GB)

(54) Load balancing

(57) In a preferred embodiment, the present invention provides a method of load balancing in a data processing system comprising a plurality of physical CPUs and a plurality of virtual CPUs, the method comprising: mapping one or more virtual CPUs to each of said physical CPUs; and dynamically adapting the mapping depending on the load of said physical CPUs.

Printed by Jouve, 75001 PARIS (FR)

Description
Background

[0001] A virtual machine is a self-contained execution environment that behaves as if it were a separate computer and can run its own operating system. Virtual machines provide virtual CPUs (VCPUs) to clients or "guests", and each VCPU runs on a dedicated physical CPU. A VCPU is a representation of a physical processor within a virtual machine. In conventional systems, the mapping between virtual and physical CPUs is static.
[0002] If multiple CPUs are available to a client or "guest", the guest tasks are spread across the CPUs. This is preferably done such that the available resources are used in the most efficient way and computing time is decreased. This process is generally referred to as "load balancing".
[0003] Conventional load balancing algorithms may be insufficient. Consider, for example, the sharing of a plurality of physical CPUs between dedicated real-time software and generic server software. Assume a UP (uniprocessor) execution environment (e.g. LINUX) running the real-time software on CPU 0, and an SMP (symmetric multiprocessing) execution environment (LINUX) running server software on CPUs 0-3. In this example, CPU 0 is shared between the real-time software and the server software, and the dedicated real-time software has a higher scheduling priority. The SMP load balancer does not take the real-time activity on CPU 0 into account, which may skew the SMP load balancing.
[0004] The present invention aims to address this and other problems of conventional load balancing. In particular,
but not exclusively, the present invention is concerned with better balancing the load of physical CPUs in a computer
system comprising physical and virtual CPUs.
Summary of the Invention

[0005] The invention is recited by the independent claims. Preferred features are recited by the dependent claims.

Brief Description of the Drawings
[0006]

Fig. 1 illustrates schematically the architecture of a system to which the present invention can be applied; and
Figs. 2 to 5 illustrate screen shots showing CPU usage statistics with and without load balancing in accordance with
an embodiment of the invention.

Description of Exemplary Embodiments of the Invention
Introduction


[0007] Fig. 1 illustrates schematically the architecture of a system comprising physical CPUs, virtual CPUs, and client
(guest) applications, to which the present invention can be applied.
[0008] The present invention is based on the realisation that the overall performance of a system as shown in Fig. 1 can be improved by automatically balancing the load on the physical CPUs attributed to the same SMP guest. In particular, the idea is to balance the physical CPU load by migrating a virtual CPU (VCPU) from one physical CPU to another. Although this departs from the static VCPU-to-CPU assignment of a UP guest, the VCPU-to-CPU mapping within a fixed set of CPUs is transparent to SMP guests, so a scheduler which implements the present invention can decide which VCPU is to run on which physical CPU within the fixed CPU set. In other words, if an SMP guest is running on two physical CPUs (a & b), the scheduler creates two VCPUs (x & y) and maps them to the physical CPUs. In this example, two equivalent mappings are possible:
(1) VCPUx → CPUa and VCPUy → CPUb
(2) VCPUx → CPUb and VCPUy → CPUa


[0009] The scheduler is able to dynamically choose one of these mappings depending on the CPU loads. A mapping switch results in a "swapping" of VCPUs, i.e. in two VCPUs migrating from one physical CPU to the other. Such an operation is fully transparent to the guest and does not change the fixed set of physical CPUs assigned to the guest.
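
As an illustration of such a mapping and of the swap operation (not part of the patent text), the following minimal Python sketch shows how a scheduler might represent the VCPU-to-CPU mapping of an SMP guest and swap it; the names Guest, vcpu_map and swap_mapping are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a hypothetical representation of the
# VCPU-to-physical-CPU mapping of an SMP guest, and the "swap"
# operation described in paragraph [0009].

class Guest:
    def __init__(self, name, vcpu_map):
        # vcpu_map: dict mapping a VCPU id to the physical CPU it runs on,
        # e.g. {"VCPUx": "CPUa", "VCPUy": "CPUb"}
        self.name = name
        self.vcpu_map = vcpu_map
        # The fixed set of physical CPUs assigned to the guest never changes.
        self.cpu_set = frozenset(vcpu_map.values())

    def swap_mapping(self, vcpu_1, vcpu_2):
        """Swap the physical CPUs of two VCPUs of the same guest.

        The guest still sees the same fixed set of physical CPUs,
        so the operation is transparent to it.
        """
        m = self.vcpu_map
        m[vcpu_1], m[vcpu_2] = m[vcpu_2], m[vcpu_1]
        assert frozenset(m.values()) == self.cpu_set  # fixed CPU set preserved


guest = Guest("smp-guest", {"VCPUx": "CPUa", "VCPUy": "CPUb"})
guest.swap_mapping("VCPUx", "VCPUy")
print(guest.vcpu_map)  # {'VCPUx': 'CPUb', 'VCPUy': 'CPUa'}
```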
[0010] By implementing such a load balancing mechanism, it is possible to at least partially resolve the SMP load balancing skewing problem described above by migrating a server VCPU running on, for example, CPU0 to another CPU when the real-time activity is high and this VCPU is loaded. In order to improve the overall performance by such a migration, an underloaded CPUn (n > 0, in this example) must be found, and the VCPU running on CPUn must migrate to CPU0.
[0011] This solution is only partial in that it does not work when the system is heavily loaded, i.e. when all physical CPUs are fully loaded. However, as such a situation is rare in practice, this limitation is acceptable.

Migration criteria

[0012] Periodically, at a given time period, the scheduler calculates the load of each physical CPU (LOAD(n), n = 0..N) and of all VCPUs running on each physical CPU (VLOAD(n,m), m = 0..M). More particularly, two loads are calculated for each VCPU: a "positive" load and a "negative" load.
[0013] The positive load for a given VCPU(n,m) is equal to the actual VCPU load:

VLOAD+(n,m) = VLOAD(n,m)

[0014] The negative load for a given VCPU(n,m) is equal to the sum of the loads of all other VCPUs running on the same physical CPU:

VLOAD-(n,m) = Σ VLOAD(n,k) over all k ≠ m
[0015] A physical CPU is considered overloaded if its load is above a predetermined threshold:

LOAD(n) > LOADover

[0016] A physical CPU is considered underloaded if its load is below a predetermined threshold:

LOAD(n) < LOADunder

[0017] Load balancing is only applied to a pair of CPUs in which one CPU is overloaded and the other CPU is underloaded:

LOAD(i) > LOADover and LOAD(j) < LOADunder

where LOADover and LOADunder denote the predetermined overload and underload thresholds, respectively.
[0018] The load balancing comprises finding two unbalanced VCPUs of the same SMP guest running on CPUi and CPUj such that:

VLOAD+(i,m) > VLOAD+(j,k)

and swapping these VCPUs across the physical CPUs.

[0019] Because a VCPU migration is, in terms of processing power, quite an expensive operation, the migration criterion is adjusted by introducing a positive migration threshold:

VLOAD+(i,m) - VLOAD+(j,k) > MIGR_POS_WTMARK

[0020] In addition, the migration criterion takes into account the negative load of the overloaded emigrant:

VLOAD-(i,m) > MIGR_NEG_WTMARK

[0021] The negative load watermark avoids unnecessary migrations when the CPU overload is not caused by the simultaneous activity of multiple guests, but rather by a single guest monopolizing the physical CPU.
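
The criteria above can be summarised in a short sketch. The following Python fragment is illustrative only and not part of the patent; the function and parameter names (find_swap, vloads_i, same_guest, and so on) are hypothetical, and the loads are assumed to be percentages already measured by the scheduler over the last period.

```python
# Illustrative sketch of the migration criteria of paragraphs [0012] to [0021].
# All names and data structures are hypothetical; loads are percentages
# assumed to have been measured by the scheduler over the last period.

def negative_load(vloads, vcpu):
    """Negative load of a VCPU: sum of the loads of all other VCPUs
    running on the same physical CPU (paragraph [0014])."""
    return sum(load for v, load in vloads.items() if v != vcpu)

def find_swap(load_i, vloads_i, load_j, vloads_j, same_guest,
              load_over=100.0, load_under=80.0,
              migr_pos_wtmark=5.0, migr_neg_wtmark=5.0):
    """Return a pair (VCPU on CPUi, VCPU on CPUj) to swap, or None.

    load_i, load_j     -- measured loads of physical CPUs i and j
    vloads_i, vloads_j -- dicts {vcpu_id: positive load} per physical CPU
    same_guest(a, b)   -- True if two VCPUs belong to the same SMP guest
    """
    # Balancing is only applied to an overloaded/underloaded pair
    # (paragraphs [0015] to [0017]). ">=" is used here, as an assumption,
    # so that an exact 100% threshold (as in the implementation below)
    # can actually be reached.
    if not (load_i >= load_over and load_j < load_under):
        return None

    for v_i, pos_i in vloads_i.items():
        for v_j, pos_j in vloads_j.items():
            if not same_guest(v_i, v_j):
                continue
            # Positive migration threshold (paragraph [0019]): the load
            # difference must be worth the cost of a migration.
            if pos_i - pos_j <= migr_pos_wtmark:
                continue
            # Negative load watermark (paragraphs [0020] and [0021]): skip
            # the swap if the overload is caused by the emigrant VCPU alone.
            if negative_load(vloads_i, v_i) <= migr_neg_wtmark:
                continue
            return v_i, v_j
    return None
```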
Specific Implementation

[0022] A mymips program has been used to demonstrate skewing in the SMP load balancing of a guest operating system. The mymips program continuously calculates the program execution speed (MIPS) and prints the results on the console. mymips provides the following result when running on an SMP Linux guest with a dedicated single physical CPU:
min/max/ave: 258/258/258


[0023] The results above and below were obtained on a DELL D820 Dual Core 1.8 GHz laptop.
[0024] Two mymips programs provide the following result when running on an SMP Linux with two dedicated CPUs:

min/max/ave: 257/259/258
min/max/ave: 258/259/258
[0025] A basic configuration which can be used to implement the load balancing mechanism in accordance with an embodiment of the invention comprises two SMP Linux guests sharing two physical CPUs. In order to obtain an unbalanced load on such a configuration, the guests were run on a conventional system without a load balancing mechanism. Two mymips programs running simultaneously, one on each guest, provide the following results:
min/max/ave: 101/258/190
min/max/ave: 92/257/190


[0026] This shows a performance hit of about 25% compared to a single SMP Linux which performs load balancing (at OS level) across multiple CPUs. The performance hit is due to sporadic mymips migrations from one CPU to another; such migrations randomly place both mymips instances on the same processor.
[0027] This practical result is fully in line with a theoretical determination. Because of the random nature of the migrations, the probability of both mymips instances running on the same CPU is 0.5. When two programs share one CPU, each receives only half of the CPU power, so the expected performance hit is 0.5 × 0.5 = 0.25.
[0028] Fig. 2 illustrates a screen shot showing CPU usage statistics without load balancing in accordance with an
embodiment of the invention. The screen shot represents the scenario described above.
[0029] When running the same load with load balancing enabled, the performance is close to that of a single SMP Linux.
min/max/ave: 254/257/256
min/max/ave: 250/256/255
[0030] The load balancing compensates for sporadic migrations of mymips (from one VCPU to another) caused by the Linux SMP scheduler. In other words, the scheduler tries to execute heavily loaded VCPUs on different physical CPUs.
[0031] Fig. 3 illustrates a screen shot showing CPU usage statistics with load balancing in accordance with an
embodiment of the invention. The screen shot represents the scenario described above.
[0032] In order to confirm the above theoretical conclusion, a Linux kernel compilation was used as a variable load.
Two compilations were running in parallel on two Linux guests in the following three configurations:

(1) Two Linux guests each running on a dedicated CPU
(2) Two Linux guests sharing two CPUs without load balancing
(3) Two Linux guests sharing two CPUs with load balancing

[0033] Each time, the duration of compilation was measured. Results corresponding to different Linux kernel compilations are shown below.
(1.1) 11m21.046s
(1.2) 8m34.204s


(2.1) 16m4.272s
(2.2) 12m20.575s

(3.1) 13m51.974s
(3.2) 10m32.467s

[0034] The performance hit on the system without load balancing (2) is about 40%, while the performance hit on the system with load balancing (3) is about 20%. Accordingly, the load balancing reduces the performance degradation caused by a transparent CPU sharing among multiple SMP guests.
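
As a quick sanity check (not part of the patent text), the quoted percentages can be recomputed from the measured durations above; the small Python fragment below simply converts the times to seconds and prints the relative slowdowns.

```python
# Quick check of the quoted performance hits, using the compilation
# times listed in paragraph [0033], converted to seconds.
dedicated = [11 * 60 + 21.046, 8 * 60 + 34.204]    # configuration (1)
no_balance = [16 * 60 + 4.272, 12 * 60 + 20.575]   # configuration (2)
balance = [13 * 60 + 51.974, 10 * 60 + 32.467]     # configuration (3)

for label, times in (("without load balancing", no_balance),
                     ("with load balancing", balance)):
    hits = [t / ref - 1.0 for t, ref in zip(times, dedicated)]
    print(label, [f"{h:.0%}" for h in hits])
# without load balancing ['42%', '44%']   -> roughly the quoted 40%
# with load balancing ['22%', '23%']      -> roughly the quoted 20%
```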
[0035] Figs. 4 and 5 illustrate screen shots showing CPU usage statistics with and without load balancing in accordance
with an embodiment of the invention. The screen shots represent the scenarios described above.
[0036] The above results were obtained using the following load balancing parameters:

PERIOD = 10 milliseconds
LOADunder = 80%
LOADover = 100%
MIGR_POS_WTMARK = 5%
MIGR_NEG_WTMARK = 5%


[0037] It may be possible to achieve even better results on a variable load by modifying these parameters.
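
To illustrate how these parameters could be fed into the hypothetical find_swap() sketch given under "Migration criteria" above, a possible call is shown below. The snippet assumes that sketch is in scope, and the measured loads are invented purely for illustration.

```python
# Hypothetical use of the find_swap() sketch from the "Migration criteria"
# section with the parameters of paragraph [0036]. The measured loads below
# are invented purely for illustration.
PARAMS = dict(load_over=100.0, load_under=80.0,
              migr_pos_wtmark=5.0, migr_neg_wtmark=5.0)

guest_of = {"srv0": "server", "srv1": "server", "rt0": "realtime"}

def same_guest(a, b):
    return guest_of[a] == guest_of[b]

# CPU0 runs the real-time VCPU plus a busy server VCPU; CPU1 is underloaded.
swap = find_swap(load_i=100.0, vloads_i={"rt0": 60.0, "srv0": 40.0},
                 load_j=30.0,  vloads_j={"srv1": 30.0},
                 same_guest=same_guest, **PARAMS)
print(swap)  # ('srv0', 'srv1'): migrate the busy server VCPU away from CPU0
```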

Other aspects and embodiments

[0038] It will be clear from the foregoing that the above-described embodiments are only examples, and that other embodiments are possible and included within the scope of the invention as determined from the claims.

45

Claims
1. A method of load balancing in a data processing system comprising a plurality of physical CPUs and a plurality of virtual CPUs, the method comprising:

mapping one or more virtual CPUs to each of said physical CPUs; and
dynamically adapting the mapping depending on the load of said physical CPUs.

2. The method of claim 1, comprising:

running a multiprocessor operation involving at least two physical CPUs;
assigning at least two virtual CPUs to said multiprocessor operation, for executing said multiprocessor operation; and
mapping the at least two virtual CPUs to said at least two physical CPUs, respectively, wherein the mapping depends on the load of said at least two physical CPUs.
3. The method of claim 2, comprising:

swapping the mapping of the two virtual CPUs to the at least two physical CPUs, respectively, in response to a change of load of at least one of the two physical CPUs.
4. The method of claim 2 or 3, comprising:

determining the load of said two virtual CPUs;
determining whether the load of said two physical CPUs is above a first threshold or below a second threshold; and
swapping the mapping of the two virtual CPUs to the two physical CPUs, respectively, if the following conditions are met:

(i) the load of one of the two physical CPUs is above said first threshold,
(ii) the load of the other one of the two physical CPUs is below said second threshold, and
(iii) the load of a virtual CPU mapped to the physical CPU whose load is above said first threshold is higher than the load of a virtual CPU mapped to the physical CPU whose load is below said second threshold.

5. The method of claim 4, comprising:

swapping the mapping only if the difference of loads of said two virtual CPUs is above a predetermined threshold.

6. The method of claim 4 or 5, comprising:

determining, for a given virtual CPU, a negative load indicative of the load of all other virtual CPUs that are allocated to the same physical CPU as the given virtual CPU; and
swapping the mapping only if the negative load is above a predetermined threshold.
7. The method of claim 2, comprising:

running an operation of a high scheduling priority on one of said at least two physical CPUs, wherein said multiprocessor operation has a relatively lower scheduling priority.

8. The method of claim 7, wherein the operation of a higher scheduling priority is executed by a dedicated real-time software.

9. The method of claim 7 or 8, wherein said multiprocessor operation is executed by a generic server software.

10. Computer system, arranged to perform the method of any preceding claim.
11. Computer program product, comprising machine-readable code which, when executed by a data processing device, executes the method of any of claims 1 to 9.
