Often, we can easily solve a problem in O(n²) time using a brute force algorithm, but such an algorithm is too slow if the input size is large. In fact, a frequent goal in algorithm design is to find O(n) or O(n log n) time algorithms for problems that can be trivially solved in O(n²) time. Sorting is one way to achieve this goal.

For example, suppose that we want to check if all elements in an array are unique.

A brute force algorithm goes through all pairs of elements in O(n²) time:

```cpp
bool ok = true;
for (int i = 0; i < n; i++) {
    for (int j = i+1; j < n; j++) {
        if (array[i] == array[j]) ok = false;
    }
}
```

However, we can solve the problem in O(n log n) time by first sorting the array. Then, if there are equal elements, they are next to each other in the sorted array, so they are easy to find in O(n) time:

```cpp
bool ok = true;
sort(array, array+n);
for (int i = 0; i < n-1; i++) {
    if (array[i] == array[i+1]) ok = false;
}
```

Several other problems can be solved in a similar way in O(n log n) time, such as counting the number of distinct elements, finding the most frequent element, and finding two elements whose difference is minimum.
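For instance, counting the number of distinct elements can be sketched as follows; the function name `countDistinct` and the use of a `vector` are illustrative choices rather than part of the text above:

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Counts distinct elements in O(n log n) time: after sorting,
// each distinct value begins a maximal run of equal elements,
// so we count positions where the value differs from its predecessor.
int countDistinct(vector<int> v) {
    sort(v.begin(), v.end());
    int count = 0;
    for (size_t i = 0; i < v.size(); i++) {
        if (i == 0 || v[i] != v[i-1]) count++;
    }
    return count;
}
```

The same sorted-runs idea extends to finding the most frequent element (track the longest run) and to finding two elements with minimum difference (compare adjacent elements).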

## Sweep Line Algorithms:

A sweep line algorithm models a problem as a set of events that are processed in a sorted order. For example, suppose that there is a restaurant and we know the arriving and leaving times of all customers on a certain day. Our task is to find out the maximum number of customers who visited the restaurant at the same time.

For example, consider an instance of the problem with four customers A, B, C, and D. In this case, the maximum number of simultaneous customers is three, between A’s arrival and B’s leaving.

To solve the problem, we create two events for each customer: one event for arrival and another event for leaving. Then, we sort the events and go through them according to their times. To find the maximum number of customers, we maintain a counter whose value increases when a customer arrives and decreases when a customer leaves. The largest value of the counter is the answer to the problem.

In our example scenario, each customer is assigned two events: “+” denotes an arriving customer and “−” denotes a leaving customer. The resulting algorithm works in O(n log n) time, because sorting the events takes O(n log n) time and the sweep line part takes O(n) time.
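The sweep line described above might be sketched as follows; representing events as (time, ±1) pairs and the name `maxCustomers` are assumptions of this sketch. Note that with this sorting, a leaving event at time t is processed before an arrival at the same time t, a tie-breaking choice that treats customers who only meet at a boundary moment as not simultaneous.

```cpp
#include <algorithm>
#include <utility>
#include <vector>
using namespace std;

// Each customer contributes two events: (arrival, +1) and (leaving, -1).
// Sorting the events and scanning them while maintaining a counter
// yields the maximum number of simultaneous customers.
int maxCustomers(vector<pair<int,int>> visits) { // (arrival, leaving) times
    vector<pair<int,int>> events;
    for (auto [a, l] : visits) {
        events.push_back({a, +1});
        events.push_back({l, -1});
    }
    sort(events.begin(), events.end()); // by time; -1 before +1 on ties
    int count = 0, best = 0;
    for (auto [time, delta] : events) {
        count += delta;
        best = max(best, count);
    }
    return best;
}
```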

## Scheduling Events:

Many scheduling problems can be solved by sorting the input data and then using a greedy strategy to construct a solution. A greedy algorithm always makes a choice that looks the best at the moment and never takes back its choices.

As an example, consider the following problem: given n events with their starting and ending times, find a schedule that includes as many events as possible. For example, in one instance of the problem an optimal solution is to select two events.

In this problem, there are several ways we could sort the input data. One strategy is to sort the events according to their lengths and select events that are as short as possible. However, this strategy does not always work. Another idea is to sort the events according to their starting times and always select the next possible event that begins as early as possible. However, there is a counterexample for this strategy as well.

A third idea is to sort the events according to their ending times and always select the next possible event that ends as early as possible. It turns out that this algorithm always produces an optimal solution. To see why, consider what happens if we instead first select an event that ends later than the event that ends earliest. That choice leaves us with at most as many options for selecting the next event. Hence, selecting an event that ends later can never yield a better solution, and the greedy algorithm is correct.
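A sketch of this greedy strategy, assuming events are given as (start, end) pairs and that an event may begin exactly when the previous one ends; the name `maxEvents` is an illustrative choice:

```cpp
#include <algorithm>
#include <climits>
#include <utility>
#include <vector>
using namespace std;

// Greedy event scheduling: sort events by ending time and repeatedly
// pick the next event that starts no earlier than the last chosen
// event's end.
int maxEvents(vector<pair<int,int>> events) { // (start, end) times
    // Store as (end, start) so the default pair ordering sorts by end time.
    vector<pair<int,int>> byEnd;
    for (auto [s, e] : events) byEnd.push_back({e, s});
    sort(byEnd.begin(), byEnd.end());
    int count = 0, lastEnd = INT_MIN;
    for (auto [e, s] : byEnd) {
        if (s >= lastEnd) { // touching events allowed; use > to forbid them
            count++;
            lastEnd = e;
        }
    }
    return count;
}
```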

### Tasks and Deadlines:

Finally, consider a problem where we are given n tasks with durations and deadlines, and we must choose the order in which to perform the tasks. For each task, we earn d − x points, where d is the task’s deadline and x is the moment when we finish the task.

What is the largest possible total score we can obtain?

In an optimal schedule for the tasks in our example scenario, C yields 6 points, B yields 5 points, A yields −7 points, and D yields 2 points, so the total score is 6.

It turns out that the optimal solution to the problem does not depend on the deadlines at all, but a correct greedy strategy is to simply perform the tasks sorted by their durations in increasing order. The reason for this is that if we ever perform two tasks one after another such that the first task takes longer than the second task, we can obtain a better solution if we swap the tasks.

For example, suppose that two tasks X and Y with durations a and b are scheduled consecutively, with X before Y. If a > b, the tasks should be swapped: after the swap, X gives b points less and Y gives a points more, so the total score increases by a − b > 0. Thus, in an optimal solution, a shorter task must always come before a longer task, and the tasks must be sorted by their durations.
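A sketch of this greedy strategy, assuming tasks are given as (duration, deadline) pairs; the name `maxScore` is an illustrative choice:

```cpp
#include <algorithm>
#include <utility>
#include <vector>
using namespace std;

// Performing the tasks in increasing order of duration maximizes the
// total of (deadline - finishing time) over all tasks.
long long maxScore(vector<pair<int,int>> tasks) { // (duration, deadline)
    sort(tasks.begin(), tasks.end()); // sorts primarily by duration
    long long time = 0, score = 0;
    for (auto [dur, dl] : tasks) {
        time += dur;        // moment when this task finishes
        score += dl - time; // points earned (may be negative)
    }
    return score;
}
```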