The original version of this story appeared in Quanta Magazine.
Computer scientists often deal with abstract problems that are hard to grasp, but an exciting new algorithm matters to anyone who owns books and at least one shelf. The algorithm tackles what is known as the library sorting problem (more formally, the “list labeling” problem). The challenge is to devise a strategy for organizing books in some kind of sorted order, alphabetically for instance, that minimizes how long it takes to place a new book on the shelf.
Imagine, for example, that you keep your books clumped together, leaving empty space on the far right of the shelf. If you then add a book by Isabel Allende to your collection, you might have to move every book on the shelf to make room for it. That would be a time-consuming operation. And if you then acquire a book by Douglas Adams, you would have to do it all over again. A better arrangement would leave unoccupied spaces distributed throughout the shelf, but how exactly should they be distributed?
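To make the contrast concrete, here is a minimal Python sketch, purely my own illustration rather than anything from the papers discussed here. It compares a packed shelf, where one insertion can force every book to shift, with a shelf that keeps empty slots scattered among the books, where an insertion only nudges a few neighbors toward the nearest gap. The function names and the example shelves are invented for the demonstration.

```python
# Illustrative sketch only: books are author names kept in alphabetical order,
# and empty slots on the shelf are represented by None.

def insert_packed(shelf, book):
    """Packed shelf (a plain list with no gaps): insert in sorted order and
    return how many books had to be shifted to make room."""
    pos = 0
    while pos < len(shelf) and shelf[pos] < book:
        pos += 1
    shelf.insert(pos, book)         # every book to the right of pos moves over
    return len(shelf) - pos - 1

def insert_with_gaps(shelf, book):
    """Shelf with empty slots scattered among the books: shift books toward
    the nearest gap on the right and return how many of them moved."""
    pos = 0                          # where the book belongs among occupied slots
    for i, b in enumerate(shelf):
        if b is not None and b < book:
            pos = i + 1
    gap = pos                        # nearest empty slot at or to the right of pos
    while shelf[gap] is not None:    # assumes such a gap exists
        gap += 1
    for i in range(gap, pos, -1):    # shift the in-between books one slot right
        shelf[i] = shelf[i - 1]
    shelf[pos] = book
    return gap - pos

packed = ["Borges", "Calvino", "Eco", "Morrison"]
print(insert_packed(packed, "Adams"))        # 4 books move: everything shifts

gapped = ["Borges", None, "Calvino", None, "Eco", None, "Morrison", None]
print(insert_with_gaps(gapped, "Adams"))     # 1 book moves: only "Borges"
```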
This problem was introduced in a 1981 paper, and it goes beyond simply providing librarians with organizational guidance. That is because the problem also applies to the arrangement of files on hard drives and in databases, where the items to be arranged can number in the billions. An inefficient system means significant waiting times and major computational expense. Researchers have invented some efficient methods for storing items, but they have long wanted to determine the best possible way.
Last year, in a study presented at the Foundations of Computer Science conference in Chicago, a team of seven researchers described a way to organize items that comes close to the theoretical ideal. The new approach combines a little knowledge of the bookshelf’s contents with the surprising power of randomness.
“It’s a very important problem,” said Seth Pettie, a computer scientist at the University of Michigan, because many of the data structures we rely on today store information sequentially. He called the new work “extremely inspired [and] easily one of my three favorite papers of the year.”
Narrowing the bounds
So how do you measure a well-organized bookshelf? A common way is to look at how long it takes to insert a single item. Naturally, that depends on how many items there are in the first place, a value typically denoted by n. In the Isabel Allende example, where every book has to be moved to accommodate a new one, the time it takes is proportional to n. The bigger n gets, the longer it takes. That makes n an “upper bound” for the problem: it will never take longer than a time proportional to n to add one book to the shelf.
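A quick way to see that bound in action: reusing the illustrative insert_packed sketch from above, a book that belongs at the far left of a packed shelf forces every one of the n existing books to move, so the cost grows in direct proportion to n.

```python
# Worst case for the packed shelf: the new book belongs before everything else.
for n in (10, 100, 1000):
    shelf = [f"book{i:04d}" for i in range(1, n + 1)]   # already in sorted order
    moved = insert_packed(shelf, "book0000")            # goes to the far left
    print(n, moved)                                     # moved == n every time
```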
The authors of the 1981 paper that introduced this problem wanted to know whether one could do better than n. And indeed, they proved that it was possible. They created an algorithm that guaranteed an average insertion time proportional to (log n)². This algorithm had two properties: it was “deterministic,” meaning its decisions did not depend on any randomness, and it was also “smooth,” meaning the books must be spread evenly within the subsections of the shelf where insertions (or deletions) are made. The authors left open the question of whether the upper bound could be improved even further. For over four decades, no one managed to do so.
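For intuition about how a “smooth” and “deterministic” strategy can work, here is a rough sketch in that spirit. It is a simplification of the general idea, not the 1981 paper’s actual algorithm; the power-of-two sections and the 0.75 density threshold are choices made only for this illustration. When the neighborhood of an insertion gets too crowded, the books in a larger enclosing section of the shelf are spread out evenly again, and the insertion then only nudges books toward a nearby gap.

```python
# Sketch of a smooth, deterministic insertion strategy (illustrative only).
# It reuses insert_with_gaps from the earlier sketch and assumes the shelf
# always has some spare capacity.

def rebalance(shelf, lo, hi):
    """Spread the books in shelf[lo:hi] evenly across that section."""
    books = [b for b in shelf[lo:hi] if b is not None]
    shelf[lo:hi] = [None] * (hi - lo)
    if books:
        step = (hi - lo) / len(books)
        for k, b in enumerate(books):
            shelf[lo + int(k * step)] = b

def insert_smooth(shelf, book, max_density=0.75):
    """Deterministic insertion: no coin flips anywhere. Find the smallest
    power-of-two section around the insertion point that can absorb the new
    book without exceeding the density threshold, spread its books out
    evenly, then slide the book into place."""
    pos = 0                                    # where the book belongs
    for i, b in enumerate(shelf):
        if b is not None and b < book:
            pos = i + 1
    size = 2
    while size < len(shelf):
        lo = (pos // size) * size
        hi = min(lo + size, len(shelf))
        occupied = sum(b is not None for b in shelf[lo:hi]) + 1   # +1: new book
        if occupied <= max_density * (hi - lo):
            break
        size *= 2                              # section too dense: look wider
    else:
        lo, hi = 0, len(shelf)                 # fall back to the whole shelf
    rebalance(shelf, lo, hi)
    return insert_with_gaps(shelf, book)       # returns how many books moved
```

Loosely speaking, an insertion only ever considers sections whose sizes double step by step, so at most about log n section sizes come into play; that kind of structure is what analyses of smooth strategies exploit to keep the average cost down to powers of log n rather than n.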
However, the lower bound was improved in the intervening years. While the upper bound specifies the maximum possible time needed to insert a book, the lower bound gives the fastest possible insertion time. To settle a problem definitively, researchers strive to narrow the gap between the upper and lower bounds, ideally until they coincide. When that happens, the algorithm is deemed optimal: firmly bounded from above and below, with no room left for further refinement.