
Master Thesis

Software Engineering

Thesis no: MSE-2008:05

February 2008

Fractal Compression of Medical Images

Wojciech Walczak

School of Engineering

Blekinge Institute of Technology

Box 520

SE – 372 25 Ronneby

Sweden

This thesis is submitted to the School of Engineering at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. The thesis is equivalent to 20 weeks of full time studies.

Contact Information:

Author:

Wojciech Walczak

Address: ul. Krzętowska 3A, 97-525 Wielgomłyny, Poland

E-mail: [email protected]

University advisors:

Bengt Aspvall

School of Engineering

Blekinge Institute of Technology, Sweden

Jan Kwiatkowski

Institute of Applied Informatics

Wrocław University of Technology, Poland

School of Engineering

Blekinge Institute of Technology

Box 520

SE – 372 25 Ronneby

Sweden

Internet: www.bth.se/tek
Phone: +46 457 38 50 00
Fax: +46 457 271 25


Faculty of Computer Science and Management

field of study: Computer Science

specialization: Software Engineering

Master Thesis

Fractal Compression of Medical Images

Wojciech Walczak

Keywords: fractal compression, fractal magnification, medical imaging

Short abstract:

The thesis investigates the suitability of fractal compression for medical images. The fractal compression method is selected through a survey of the literature and adapted to the domain. The emphasis is put on shortening the encoding time and minimizing the loss of information.

Fractal magnification, the most important advantage of fractal compression, is also discussed and tested. The proposed method is compared with existing lossy compression methods and magnification algorithms.

Supervisor:

Jan Kwiatkowski


Wrocław 2008

Contents

Introduction
    Background
    Research Aim and Objectives
    Research Questions
    Thesis Outline

Chapter 1. Digital Medical Imaging
    1.1. Digital Images
        1.1.1. Analog and Digital Images
        1.1.2. Digital Image Characteristics
    1.2. Medical Images

Chapter 2. Image Compression
    2.1. Fundamentals of Image Compression
        2.1.1. Lossless Compression
        2.1.2. Lossy Compression
    2.2. Fractal Compression
        2.2.1. Decoding
        2.2.2. Encoding
    2.3. Fractal Magnification

Chapter 3. Fractal Compression Methods
    3.1. Partitioning Methods
        3.1.1. Uniform Partitioning
        3.1.2. Overlapped Range Blocks
        3.1.3. Hierarchical Approaches
        3.1.4. Split-and-Merge Approaches
    3.2. Domain Pools and Virtual Codebooks
        3.2.1. Global Codebooks
        3.2.2. Local Codebooks
    3.3. Classes of Transformations
        3.3.1. Spatial Contraction
        3.3.2. Symmetry Operations
        3.3.3. Block Intensity
    3.4. Quantization
        3.4.1. Quantization During Encoding
        3.4.2. Quantization During Decoding
    3.5. Decoding Approaches
        3.5.1. Pixel Chaining
        3.5.2. Successive Correction Decoding
        3.5.3. Hierarchical Decoding
        3.5.4. Decoding with Orthogonalization
    3.6. Post-processing
    3.7. Discussion

Chapter 4. Accelerating Encoding
    4.1. Codebook Reduction
    4.2. Invariant Representation and Invariant Features
    4.3. Nearest Neighbor Search
    4.4. Block Classification
        4.4.1. Classification by Geometric Features
        4.4.2. Classification by Intensity and Variance
        4.4.3. Archetype Classification
    4.5. Block Clustering
    4.6. Excluding Impossible Matches
    4.7. Tree Structured Search
    4.8. Multiresolution Search
    4.9. Reduction of Time Needed for Distance Calculation
    4.10. Parallelization

Chapter 5. Proposed Method
    5.1. The Encoding Algorithm Outline
    5.2. Splitting the Blocks
    5.3. Codebook
        5.3.1. On-the-fly Codebook
        5.3.2. Solid Codebook
        5.3.3. Hybrid Solutions
    5.4. Symmetry Operations
    5.5. Constructing the Fractal Code
        5.5.1. Standard Approach
        5.5.2. New Approach
        5.5.3. Choosing the Parameters
        5.5.4. Adaptation to Irregular Regions Coding
    5.6. Time Cost Reduction
        5.6.1. Variance-based Acceleration
        5.6.2. Parallelization
    5.7. Summary of the Proposed Compression Method

Chapter 6. Experimental Results
    6.1. Block Splitting
    6.2. Number of Bits for Scaling and Offset Coefficients
    6.3. Coding the Transformations
    6.4. Acceleration
        6.4.1. Codebook Size Reduction
        6.4.2. Breaking the Search
        6.4.3. Codebook Optimization
        6.4.4. Variance-based Acceleration
        6.4.5. Parallelization
        6.4.6. Spiral Search
    6.5. Comparison with JPEG
    6.6. Magnification

Conclusions
    Research Question 1: Is it possible, and how, to minimize the drawbacks of the fractal compression method to a satisfying level in order to apply this method to medical imaging?
    Research Question 2: Which fractal compression method suits medical images best and gives the best results?
    Research Question 3: Does fractal compression preserve image quality better or worse than other irreversible (lossy) compression methods?
    Research Question 4: Can the results of fractal magnification be better than the results of traditional magnification methods?
    Future Work

Appendix A. Sample Images
Appendix B. Glossary
Appendix C. Application Description and Instructions for Use
    C.1. How to Run the Program
    C.2. Common Interface Elements and Functionalities
        C.2.1. Menu Bar
        C.2.2. Tool Bar
        C.2.3. Status Bar
        C.2.4. Pop-up Menus
    C.3. WoWa Fractal Encoder
        C.3.1. Original Tab
        C.3.2. Comparison Tab
        C.3.3. Partitions Tab
        C.3.4. Transformations Tab
        C.3.5. Log Tab
        C.3.6. Image Comparator
    C.4. WoWa Fractal Decoder
    C.5. Settings
        C.5.1. Application Preferences
        C.5.2. Encoder Settings
        C.5.3. Decoder Settings
Appendix D. Source Code and Executable Files

List of Figures
Bibliography


Abstract

Medical images, like any other digital data, require compression in order to reduce the disk space needed for storage and the time needed for transmission. Lossless compression methods for still images can shorten the file only to a very limited degree.

The application of fractal compression to medical images would allow much higher compression ratios to be obtained, while fractal magnification – an inseparable feature of fractal compression – would be very useful in presenting the reconstructed image in a highly readable form. However, like all irreversible methods, fractal compression is burdened with the problem of information loss, which is especially troublesome in medical imaging. A very time-consuming encoding process, which can last even several hours, is another bothersome drawback of fractal compression. Based on a survey of the literature and his own reflections, the author attempts to provide a solution adapted to the needs of medical imaging that overcomes the unfavorable shortcomings of fractal compression methods. The thesis provides not only theoretical deliberations but also an implementation of the proposed algorithm, which is used to test the suitability of fractal compression for medical imaging. The results of the work are more than satisfying – the fidelity of images compressed with the proposed fractal compression method meets the requirements imposed on medical images, and fractal magnification outperforms other magnification techniques.

v

Summary

Information systems in hospitals and clinics store an enormous number of examination results. Long-term storage of medical data can bring measurable benefits, because physicians can consult the results of previous examinations and the case history at any time. Hospital databases grow rapidly, which collides with the limited capacity of storage devices. Financial outlays are therefore necessary to increase this capacity. Telemedicine is also developing quickly, and remote surgery has recently been gaining popularity.

During such an operation, one of the specialists, although far away from the operating site, can follow its course in real time thanks to image transmission. Unfortunately, the quality of such images is limited by the bandwidth of the Internet connection over which the transmission takes place.

Data compression can be very helpful both in the long-term storage and in the transmission of image data. However, in the case of medical images, where the correct assessment of the patient's condition depends on the image quality, the information carried by the image is particularly important.

The aim of this thesis is to verify whether fractal compression can be applied to medical images. This entails the need to minimize the drawbacks of fractal compression, the most important of which are the irreversible loss of part of the information carried by the image and the long compression time. However, fractal compression also has advantages, the greatest of which is the possibility of fractal image magnification. This thesis therefore tries to answer the question of whether, and how, the drawbacks of fractal compression can be reduced to a level that makes it usable for the compression of medical images. Fractal compression is also compared with other lossy compression methods, as well as with other image magnification algorithms. To achieve this aim, the following steps were carried out:

1. a review of fractal compression methods based on the literature,

2. a review of techniques for accelerating the compression process,

3. implementation of the selected fractal compression method, the one best suited to the domain of medical imaging,

4. implementation of selected acceleration techniques,


5. experiments aimed at estimating the amount of information lost and the compression time of the selected method,

6. experiments aimed at measuring the impact of the implemented acceleration techniques on the compression time,

7. comparison of the results obtained with the implemented fractal compression method against other lossy compression methods and against magnification algorithms.

As a result, it turned out that fractal compression can be successfully applied to the compression of medical images. Compressed ultrasound images retained a fidelity acceptable for medical images while the image data file was reduced to one ninth of its size. A losslessly compressed file would be about twice as large.

The time needed for compression was also successfully reduced. With the developed method, typical images can be compressed in no more than several dozen seconds, and an image of size 256 × 256 in only about 5 seconds.

Compared with the JPEG algorithm, the developed fractal compression method performs worse in terms of image quality for compression ratios below 14:1, whereas for ratios above 18:1 fractal compression turns out to be better. Objective measures do not indicate unambiguously which method gives a more faithful image for compression ratios between 14:1 and 18:1. However, JPEG lacks the capabilities of fractal compression, including the most important one – decompression of the image to an arbitrary size, i.e. fractal magnification.

Fractal magnification was compared with bilinear and bicubic interpolation, and neither of these methods matched it. The fractally magnified image was much sharper, with details and edges more clearly visible. A comparison of the quality of the magnified images using objective measures also clearly indicated the superiority of fractal magnification.

Introduction

Background

More and more areas of human life are becoming computerized nowadays. This results in the generation of a huge, and still growing, amount of information stored in digital form. All this is possible thanks to technological progress in recording different kinds of data. This progress can also be observed in the wide field of digital images, which covers scanned documents, drawings, images from digital or video cameras, satellite images, medical images, works of computer graphics and many more.

Many disciplines, like medicine, e-commerce, e-learning or multimedia, are bound up with a ceaseless interchange of digital images. A live on-line transmission of a sports event, a surgery with the remote participation of one or more specialists, or a teleconference in a worldwide company are good examples. Such uses of technology related to digital images are becoming very popular nowadays.

Long-term storage of any data can often be very profitable. In medicine, Hospital Information Systems contain a large number of medical examination results. Thanks to them, doctors can familiarize themselves with the case history and make a diagnosis based on many different examination results. Such systems are also very useful for patients because they gain access to their medical data. A very good example is IZIP, a Czech system which gives patients Internet access to their health records. Unfortunately, these hospital databases are growing rapidly – each day tens or hundreds of images are produced and most of them, or even all, are archived for some period.

Both mentioned aspects of digital data – sharing and storage – are linked with problems that restrain progress in new technologies and the growth of their application. When exchanging image data, one wishes to keep the quality at a high level while keeping the time needed for transmission and the disk space needed for storage as low as possible. The increase in throughput of the communication links in use is unfortunately insufficient, and some additional solution must be introduced to satisfy rising expectations and needs.

Collecting any kind of data results in a demand for increased storage capacity. Since the capacity of such devices grows quite fast, almost any demand can be technically satisfied. However, extending the capacity involves expenses that cannot be passed over.

The above-mentioned problems have resulted in research and development of data compression techniques. Over time, many different compression methods, algorithms and file formats have been developed. In still image compression there are many different approaches, and each of them gives rise to many compression methods. However, every technique proves useful only in a limited application area.

Of course, image compression methods are also much desired, or even necessary, in medicine. However, medical images require special treatment because the correctness of a diagnosis depends on it. A low-quality medical image, distortions in the image or untrue details may be harmful to human health. Thus, any processing of such images, including compression, should not interfere with the information carried by the images.

Research Aim and Objectives

The aim of the thesis is to investigate whether it is possible to apply fractal compression to medical images in order to bring the benefits of fractal compression to medical imaging.

Fractal compression has some serious drawbacks, like the problem of information loss or the long encoding time, that can thwart plans to utilize fractal compression for medical images. However, it also offers some great features, like compression ratios comparable with the JPEG standard, fractal magnification, asymmetric compression, and the absence of most distortion effects. These positive aspects of fractal compression would be very useful in medical imaging.

In the author's opinion, the aim of the thesis can be reached by finding or constructing a fractal compression method that best fulfills the needs and requirements of medical imaging and, at the same time, retains all the advantages of fractal compression. Thus, the suitability of existing fractal compression techniques for the domain is investigated, and the best-suited methods and techniques are adapted and improved.

The following objectives are realized to attain the goal:

Review and discuss fractal compression methods with special consideration of their suitability to medical imaging.

Implement the chosen fractal compression algorithms.

Perform experiments in order to evaluate the amount of information loss in the implemented algorithm.

Perform experiments in order to acquire information about the time needed for encoding with the chosen (and improved) method/methods.

Review speed-up techniques, adapt chosen techniques to the fractal compression method, implement them and measure their impact on the duration of encoding.

Compare the results of the implemented algorithm with other magnification and lossy compression techniques.

Research Questions

The thesis addresses the following research questions:


Research Question 1

Is it possible, and how, to minimize the drawbacks of the fractal compression method to a satisfying level in order to apply this method to medical imaging?

Research Question 2

Which fractal compression method suits medical images best and gives the best results?

Research Question 3

Does fractal compression preserve image quality better or worse than other irreversible (lossy) compression methods?

Research Question 4

Can the results of fractal magnification be better than the results of traditional magnification methods?

Thesis Outline

This work is organized as follows:

Introduction

Chapter 1 explains basic concepts of digital imaging and discusses the characteristics and special requirements of medical imaging.

Chapter 2 provides basic information about fractal image compression, preceded by a general description of image compression.

Chapter 3 goes into more detail about fractal compression and characterizes the broad scope of existing fractal coding methods. The emphasis is placed on features important for the compression of medical images.

Chapter 4 scrutinizes a variety of attempts to overcome one of the most important disadvantages of fractal compression – the long encoding time.

Chapter 5 gives a look inside the implemented algorithm.

Chapter 6 presents and discusses the results of experiments performed on the implementation of the proposed fractal compression method.

Conclusions present the discussion of the results and recommendations. The answers to the research questions can be found there.

The first two chapters constitute the introductory part of the thesis and can be skipped by knowledgeable readers.

Chapter 1

Digital Medical Imaging

Before medical imaging is discussed, it is necessary to provide some basic information about digital imaging. Digital images are described in the first section, and the second section concentrates on a specific class of digital images – digital medical images.

Since this is an introductory chapter, it can be skipped by readers knowledgeable in digital medical imaging. Readers who are familiar with digital imaging may omit the first section of this chapter.

1.1. Digital Images

1.1.1. Analog and Digital Images

Two classes of images can be distinguished – analog and digital images. Both types fall into the nontemporal multimedia category. Analog images are painted or created through a photographic process. During this process, the image is captured by a camera on a film that becomes a negative. We have a positive when the film is developed – no further processing is possible from this moment. When the photograph is made on a transparent medium, we are dealing with a diapositive (slide). Analog images are characterized by a continuous, smooth transition of tones. This means that between any two different points in the picture there is an infinite number of tonal values.

It is possible to transform an analog image into a digital one. The digitization process is usually driven by the need for digital processing. The output of digitization is a digital approximation of the input analog image – the analog image is replaced by a set of pixels (points organized in rows and columns), and every pixel has a fixed, discrete tone value. Therefore, the image is no longer a continuous tone of colors. The precision and accuracy of this transformation depend on the size of a pixel – the larger the area of the analog image transformed into one pixel, the less precise the approximation. [Gon02]

A digital image can be captured with a digital camera or scanner, or created with a graphics program. The transition from a digital to an analog image also takes place – through such devices as a computer monitor, a projector or a printing device.


One can distinguish many different types of digital images. First of all, digital images are divided into recorded and synthesized images. To the first group belong, for example, analog images scanned with a digital scanner. To the second group belong all images created with computer graphics programs – they come into being already as digital images.

The second possible classification of digital images divides them into vector images and raster images. Both groups can contain recorded as well as synthesized images. Vector images are mostly created with graphics software. Analog images can be recorded only as raster images, but they can then be converted to vector images. The opposite conversion (rasterization) is also possible. A vector image is treated as a set of mathematically described shapes and is most often used for drawings like logos, cartoons or technical drawings. This work concerns only raster graphics, where an image (bitmap) is defined as a set of pixels (picture elements), each filled with a color identified by a single discrete value. This kind of image is usually used for photographic content. [Dal92]

1.1.2. Digital Image Characteristics

Digital images are characterized by multiple parameters.

The first feature of a digital image is its color mode. A digital image can have one of three modes: binary, grayscale or color. A binary (bilevel) image is an image in which there are only two possible values for each pixel. In a grayscale image, each pixel can contain only a shade of gray.

As was already mentioned, a digital image is a set of pixels. Each pixel has a value that defines its color. All the pixels are composed into one array. The resolution of a digital image is the number of pixels within a unit of measure [AL99]. Typically, the resolution is measured in pixels per inch (ppi). The higher the image resolution, the better its quality. The image resolution can also be understood as the dimensions of the pixel array, specified with two integers [Dal92]:

number of pixel columns × number of pixel rows

Bit depth, also called color depth or pixel depth, specifies how many bits are used to describe the color of each pixel. A higher bit depth means that more colors are available in the image, but at the same time more disk space is needed to store the image. Monochrome images use only one bit per pixel, and grayscale images usually use 8 bits, which gives 256 gray levels. Color images can have a pixel depth of 4, 8 or 16 bits; full color can be achieved with 24 or 32 bits.
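As a simple worked illustration of how resolution and bit depth together determine the raw (uncompressed) size of a raster image – the image sizes below are assumptions made for the example, not figures from this thesis – consider the following sketch:

    # Illustrative sketch (assumed example values, not data from the thesis):
    # a bit depth of b bits allows 2**b distinct values per pixel
    # (8 bits -> 256 gray levels), and the raw storage requirement is
    # number of pixels times bits per pixel.

    def raw_size_bytes(columns: int, rows: int, bit_depth: int) -> int:
        """Uncompressed size: number of pixels times bits per pixel, in bytes."""
        return columns * rows * bit_depth // 8

    # An 8-bit grayscale image of 512 x 512 pixels:
    print(raw_size_bytes(512, 512, 8))    # 262144 bytes (256 KiB)
    # The same resolution in 24-bit full color is three times larger:
    print(raw_size_bytes(512, 512, 24))   # 786432 bytes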

Colors can be described in various ways. The next feature of digital images – the color model – not only specifies how colors are represented, but also determines the spectrum of possible pixel colors. The gamut of colors that can be displayed or printed depends on the color model that is employed. This is why a digital image in a particular color model can use only a portion of the visible spectrum – this portion is characteristic of the model.

There are many different color models; the most popular are RGB, CMY, CMYK, HSB (HSV), HLS, YUV, and YIQ. These color models are divided into two classes: “subtractive” and “additive”. CMY (Cyan, Magenta, and Yellow) and CMYK (Cyan, Magenta, Yellow, and Black) are “subtractive” models. A model from this class should be used when color is displayed with printing inks (in the presence of external light reflected by the printed image). RGB (Red, Green, Blue), one of the “additive” models, is used when color is displayed by emitting light – e.g. when the image is shown on a computer monitor. The main difference between these two kinds of color models is that in subtractive models black is achieved by combining colors, whereas in additive models this combination produces white. In subtractive models, colors are displayed thanks to the light absorbed (subtracted) by inks, while in additive models colors are displayed thanks to the transmitted (added) light. HSB (Hue, Saturation, and Brightness), also called HSV (Hue, Saturation, and Value), is more intuitive – the color of a pixel is specified by three values: hue (the wavelength of light), saturation (the amount of white in the color) and brightness (the intensity of the color). Similar to HSB, and also very intuitive, is HLS (Hue, Lightness, and Saturation). The YUV color model is part of the PAL television system and contains three components – one for luminance and two for chrominance. YIQ also has one component for luminance and two for chrominance, and is used in NTSC television.

Channels are closely related to color models. A channel is a grayscale image that reflects one component of the color model (a base color in the used color mode). Channels have the same size as the original image. Thus, an image in RGB will have three channels (red, green, blue), an image in CMYK four channels (cyan, magenta, yellow, black), an image in HSV three channels (hue, saturation, brightness), a grayscale image only one channel, etc. There can be additional channels, called alpha channels. An alpha channel stores information about the transparency of pixels. Therefore, the number of channels, although it partially depends on the color model, is also a feature of a digital image.

Color indexing is another image feature related to the color model. An indexed color model is only an option and means that the number of colors that can be used is limited to a fixed number (e.g. 256 in GIF) in order to reduce the bit depth and the size of the whole file. Most often, indexing is done automatically according to standard or system palettes. Palettes in different operating systems are not the same – they only partially overlap. From the 216 colors that are common to operating systems, a standard palette was created for the purposes of the World Wide Web. There are also other standard palettes – a palette with 16 colors is commonly used for simple images. Besides indexing to standard/system palettes, there is also adaptive indexing, in which the color space is reduced to a fixed number of colors that most accurately represent the image. Not all colors needed by the image have to be indexed, but they can be. The difference between indexing to a standard/system palette and adaptive indexing is that adaptive indexing requires the definitions of the palette colors at the beginning of the file, whereas standard palettes do not have to be attached. [AL99]

File format is the next characteristic of a digital image. A digital image can be stored in one of many file formats. Some formats are bound to one specific program, but there are also common formats that are understood by different graphics programs. There is a very close relation between file formats and compression. Images stored in a particular format are usually compressed in order to reduce the size of the file. Each format supports one or a few compression methods; there are also formats that store uncompressed data. [Dal92]

The last characteristic of a digital image is the compression method used to reduce the size of the file containing the image.


Figure 1.1. Examples of X-ray images: (a) human hand (from [Wei98]), (b) human chest (from [Wei98]), (c) human knee (from [Pin04]), (d) human skull (from [Pin04]), (e) human neck (from [Pin04]).

In conclusion, there is more than one way to reduce the amount of disk space needed to store a digital image. The most obvious one is compression, but there are also other, simpler ways, like reducing the image resolution. The number of colors can also be decreased, or an index to the used color palette can be introduced.

1.2. Medical Images

Medical imaging came into being in 1895, when W. K. Roentgen discovered X-rays. This invention was a great step forward for non-invasive diagnostics and was rewarded with the Nobel Prize in 1901. With time, other discoveries were made in the field of medical imaging that, like X-rays, support medicine and make more accurate and effective diagnosis possible. It is not feasible to list all these discoveries and inventions, and the same holds for describing all types of medical images. Thus, only the most important discoveries will be mentioned, together with a characterization of the images that are the products of these technologies.

Figure 1.2. Example of a Computerized Tomography image (from [Gon02])

Although X-rays were discovered over a century ago, they are still in common use. During an examination, the patient is placed between an X-ray source and a detector. Different tissues absorb X-rays with different strength, so the X-rays that have passed through the patient have different energies depending on what tissues they encountered. Dense tissues, e.g. bones, block the X-rays, while soft tissues give them no resistance. Parts of the detector that lie behind tissue that absorbs X-rays completely produce white areas on the image. The “softer” a tissue is, the darker the image becomes in the parts that represent this tissue. Many different X-ray detectors can be used during a medical examination. They can be divided into two classes. One class contains detectors, like the photographic plate, that give analog images; these images can be transformed into digital images by the process of digitization. The second class of detectors consists of devices that directly produce digital images. The most familiar detectors that fall into the second class are photostimulable phosphors (PSPs), direct semiconductor detectors, and combinations of a scintillator with semiconductor detectors.

In 1971, G. Hounsfield built the first Computerized Tomograph (Computerized Axial Tomograph) – an X-ray machine that produces a set of two-dimensional images (slices) which represent a three-dimensional object. For his invention, G. Hounsfield was awarded the Nobel Prize in 1979. Pictures created during tomography are called tomograms, and they form a specific class of X-ray images. There are also other classes, for example mammography images.

Apart from X-rays, other technologies are also used in medical imaging. Gamma-ray imaging is used in the field of nuclear medicine. In contrast to X-rays, there is no external source of gamma rays. A radioactive isotope, which emits gamma rays during decay, is administered to the patient. Then the gamma radiation is measured with a gamma camera (gamma-ray detectors). The most popular applications of gamma rays in medical diagnosis are the bone scan and positron emission tomography (PET). A bone scan with gamma rays can detect and locate pathologies like cancer or infections. PET generates a sequence of images that, as in X-ray tomography, represent a 3-D object.

Figure 1.3. Examples of gamma-ray images (from [Gon02])

Medical imaging also employs radio waves. Magnetic resonance imaging (MRI) is a technique in which short pulses of radio waves penetrate the patient. Each such pulse entails a response pulse of radio waves generated by all tissues. Different tissues emit a pulse with different strength. The strength and source of each response pulse are calculated, and a 2-D image is created from all of the gathered information.

The use of ultrasound imaging in medical diagnostics constitutes ultrasonography. An ultrasound system consists of an ultrasound source and receiver, a display and a computer. High-frequency sound, from 1 to 5 MHz, is sent into the patient. Boundaries between tissues partially reflect the signal and partially allow it to pass, which means that the waves can be reflected at various depths. The ultrasound receiver detects each reflected signal, and the computer calculates the distance between the receiver and the tissue whose boundary reflected the waves. The determined distances to tissues and the strengths of the reflected waves are presented on the display, i.e. they constitute a two-dimensional image. Such an image typically contains information about millions of ultrasound signals and is updated every second.

There are also other medical imaging techniques that were not described here. The most important of them are Optical Transmission and Transillumination Imaging, Positron Emission Tomography (Nuclear Medicine), Optical Fluorescence Imaging, and Electrical Impedance Imaging.

This review of medical imaging techniques reveals a large diversity of medical image classes and technologies used in medical diagnosis. Nevertheless, all these images have some common characteristics.

All the above-mentioned classes of medical images are characterized by a very restricted size. Although there are color medical images, most of them are monochromatic. Images from different classes have different sizes. The largest are the X-ray images, which can measure up to 2048 pixels vertically and horizontally. Other medical images are much smaller; for example, Computerized Tomography images are smaller than 512 × 512 pixels, Magnetic Resonance images up to 256 × 256 pixels, and USG images 700 × 500 pixels or less. [Sta04]

Figure 1.4. Examples of magnetic resonance images (from normartmark.blox.pl, 21.09.2007).

Medical images also have a limited bit depth (the number of bits used to describe the color of a single pixel). X-ray images have a bit depth of 12 bits and USG images only 8 bits. The matter is not so clear with Magnetic Resonance images. The image format used here can store 2^16 tones of gray (bit depth equal to 16) but, in fact, there are far fewer tones – about 2^9 (bit depth equal to 9). [Sta04]

There are also other, more important issues which distinguish medical images from others. Medical images form a particular class of digital images in which the carried information is extremely important. High fidelity of compression and of any other processing is required, or the diagnosis could be erroneous. The loss of information may mislead not only when a physician personally examines the image but also when software is used to analyze the image.

Receiver operating characteristic (ROC) analysis is an evaluation method used to measure the quality and diagnostic accuracy of medical images. It is performed by trained observers who rate the perceptible loss of information. For different medical image types, the analysis gives the maximal compression ratios at which the fidelity of the images meets the expectations of the observers. For sample image types, the ratios are [Kof06, Oh03]:

Ultrasonography: 9 : 1

Chest radiography: 40 : 1, 50 : 1 – 80 : 1 (JPEG2000)

Computed Tomography: 9 : 1 (chest), 10 : 1 – 20 : 1 (head)

Angiography: 6 : 1

Mammography: 25 : 1 (JPEG2000)

Brain MRI: 20 : 1

Figure 1.5. Example of a USG image.

Information loss should be avoided during processing, but the quality of presentation of the image, especially of its most important details, is also very important. One should care about the faithfulness of the image not only when it is presented at a 1:1 scale. Due to the small resolutions of medical images, their physical size on a display device will also be rather small. Because of this, it is difficult to perform measurements by hand during diagnosis, or even for a physician to read the image. Thus, magnification of the image is often very desirable, which means that a zoomed-in image should also be maximally true, legible and clear.

If it were certain that images would not be magnified, the best choice of compression method would probably be one of the lossless methods. This group of compression techniques assures that no information is lost during the encoding and decoding processes; this means that the image recovered from a compressed file will be exactly the same as the original image.

Fractal compression has one large advantage over lossless methods – it enables fractal magnification, which gives much better results than traditional magnification algorithms, e.g. nearest neighbor, bilinear interpolation or even bicubic interpolation. Fractal magnification is actually the same process as fractal decompression – an image encoded with the fractal method can be decompressed to an arbitrarily chosen size. An image compressed with one of the lossless methods must be subjected to an interpolation algorithm if it has to be magnified. This means that although the compression algorithm did not cause any distortion to the image, the interpolation algorithm will introduce some faults; for example, block effects, pixelization or blurring may appear. Fractal compression makes it possible to keep the distortion rate at a much lower level, and the image remains sharp regardless of the size to which it is magnified.
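For contrast with fractal magnification, the sketch below shows the simplest of the traditional methods named above, nearest-neighbor magnification: each source pixel is simply replicated, which is exactly what produces the block effect mentioned here. It is a generic illustration, not code from this thesis.

    # Illustrative sketch: nearest-neighbor magnification of a grayscale raster
    # given as a list of rows. Each source pixel is replicated 'factor' times
    # horizontally and vertically, which causes the characteristic blockiness.

    def nearest_neighbor_zoom(image, factor: int):
        zoomed = []
        for row in image:
            zoomed_row = [pixel for pixel in row for _ in range(factor)]
            zoomed.extend([zoomed_row[:] for _ in range(factor)])
        return zoomed

    tiny = [[0, 255],
            [255, 0]]
    print(nearest_neighbor_zoom(tiny, 2))
    # [[0, 0, 255, 255], [0, 0, 255, 255], [255, 255, 0, 0], [255, 255, 0, 0]]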

Fractal magnification is not the only quality of fractal compression. As opposed to most other compression methods, fractal coding is asymmetric. On the one hand, this is a drawback, because encoding lasts much longer than in other methods. But at the same time it is an advantage, because the decoding process is very fast – it usually takes less time to decode an image with the fractal method than to read the same image, uncompressed, from the hard drive. This feature is useful when the image must be sent through the Internet – the transmission time will be shorter, because the image representation is shorter when encoded with the fractal method (a lossy algorithm) than with any lossless method, and there will be no significant additional time cost caused by decoding.

Another feature of fractal compression that attracts attention is the high compression ratios that can be achieved with this method. Since it is a lossy method, it gives a much smaller compressed file than any lossless compression algorithm. However, medical images cannot be compressed with too high a compression ratio, because the loss of information could turn out to be too high.

Chapter 2

Image Compression

This chapter introduces the reader to the field of image compression (section 2.1) and provides a general explanation of fractal compression (section 2.2). Readers who are familiar with these topics may skip the entire chapter or parts of it.

2.1. Fundamentals of Image Compression

A compression method consists of definitions of two complex processes: compression and decompression.

Compression is a transformation of the original data representation into a different representation characterized by a smaller number of bits. The opposite process – reconstruction of the original data set – is called decompression.

Two types of compression can be distinguished: lossless and lossy. In lossless compression methods, the data set reconstructed during decompression is identical to the original data set. In lossy methods, the compression is irreversible – the reconstructed data set is only an approximation of the original image. At the cost of lower conformity between the reconstructed and original data, better compression effectiveness can be achieved. A lossy compression method is called “visually lossless” when the loss of information caused by compression and decompression is invisible to an observer (during presentation of the image under normal conditions). However, the assessment of whether the compression of an image is visually lossless is highly subjective. Besides that, the visual difference between the original and decompressed images can become visible when the observation circumstances change. In addition, processing of the image, like image analysis or noise elimination, may reveal that the compression actually was not lossless.

There are many ways to calculate the effectiveness of compression. The factor most often used for this purpose is the compression ratio (CR), which expresses the ability of the compression method to reduce the amount of disk space needed to store the data. CR is defined as the number of bits of the original image (B_org) per one bit of the compressed image (B_comp):

CR = \frac{B_{org}}{B_{comp}}


The compression percentage (CP) serves the same purpose:

CP = \left( 1 - \frac{1}{CR} \right) \cdot 100\%

Another measure of compression effectiveness is the bit rate (BR), which is equal to the average number of bits in the compressed representation of the data per element (symbol) of the original data set. High effectiveness of a compression method manifests itself in a high CR and CP, but in a low BR. When the time needed for compression is important, a different factor must be used – the product of time and bit rate. Only the most commonly used factors were mentioned here, but there are many more ways to estimate the effectiveness.
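A minimal sketch of these three measures is given below; the helper names and the example figures (a hypothetical 512 × 512, 8-bit image compressed to 32768 bytes) are illustrative assumptions, not values from the thesis.

    # Illustrative sketch of the compression effectiveness measures defined above.

    def compression_ratio(bits_original: int, bits_compressed: int) -> float:
        """CR = B_org / B_comp."""
        return bits_original / bits_compressed

    def compression_percentage(cr: float) -> float:
        """CP = (1 - 1/CR) * 100%."""
        return (1.0 - 1.0 / cr) * 100.0

    def bit_rate(bits_compressed: int, num_symbols: int) -> float:
        """BR = average number of compressed bits per original symbol (e.g. per pixel)."""
        return bits_compressed / num_symbols

    # Hypothetical example: a 512 x 512 grayscale image (8 bits per pixel)
    # compressed to 32768 bytes.
    pixels = 512 * 512
    b_org = pixels * 8
    b_comp = 32768 * 8
    print(compression_ratio(b_org, b_comp))      # 8.0
    print(compression_percentage(8.0))           # 87.5 (percent)
    print(bit_rate(b_comp, pixels))              # 1.0 bit per pixel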

2.1.1. Lossless Compression

Most lossless image compression methods are adapted universal compression techniques. Lossless compression converts an input sequence of symbols into an output sequence of codewords, so it is nothing other than a coding process. One codeword usually corresponds to one element (symbol) of the original data; in stream coders, it corresponds to a sequence of symbols. The codewords can have fixed or variable length. Decompression, of course, is the decoding of the code sequence. The output of the decoding in lossless compression is the same as the input of the coding process. The division of the stream to be encoded into the parts bound to codewords is unambiguous.

A lossless compression method comprises two phases – modeling and coding. The creation of a method boils down to specifying how those two phases should be realized.

The modeling phase builds a model of the data to be encoded which best describes the information contained in this data. The choice of the modeling method for a particular compression technique depends to a large extent on the type of data to be compressed, but it always concentrates on recognition of the input sequence, its regularities and similarities [Deo03]. The model is a different, simpler representation of the original data that eliminates the redundancy [Prz02].

The coding phase is based on statistical analysis and strives for the shortest binary code for the sequence of symbols obtained from the modeling phase [Prz02]. In this phase, analytical tools from information theory are commonly used [Prz02]. Typically, entropy coding is used at this stage [Deo03].

Not all compression methods can be divided into these two stages. There are older algorithms, like the Ziv-Lempel algorithms, that escape this classification. [Deo03]

Three groups of lossless compression methods are distinguished:

entropy-coding,

dictionary-based,

prediction methods.

In the first group – entropy coding methods – a great number of compression techniques can be found, for example Shannon-Fano coding, Huffman coding, Golomb coding, unary coding, truncated binary coding, and Elias coding. Among the entropy coding methods, arithmetic methods can also be found, e.g. range coding.

Dictionary-based methods include, for example, Lempel-Ziv-Welch (LZW) coding, LZ77 and LZ78, the Lempel-Ziv-Oberhumer algorithm, and the Lempel-Ziv-Markov algorithm.


Prediction methods have recently gained some popularity; examples are the JPEG-LS and lossless JPEG2000 algorithms.
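To make the idea of entropy coding concrete, the sketch below builds a Huffman code – one of the methods named above – for a short symbol stream. It is a generic textbook construction under a memoryless-source assumption, not the coder used in this thesis.

    # Illustrative sketch: Huffman coding (variable-length prefix codes,
    # shorter codewords for more frequent symbols).
    import heapq
    from collections import Counter

    def huffman_code(symbols):
        """Build a prefix code, symbol -> bit string, from symbol frequencies."""
        freq = Counter(symbols)
        if len(freq) == 1:                       # degenerate single-symbol stream
            return {next(iter(freq)): "0"}
        # Heap entries: (total frequency, tie-breaker, {symbol: code-so-far}).
        heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            f1, _, codes1 = heapq.heappop(heap)
            f2, _, codes2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in codes1.items()}
            merged.update({s: "1" + c for s, c in codes2.items()})
            heapq.heappush(heap, (f1 + f2, counter, merged))
            counter += 1
        return heap[0][2]

    data = "AAAABBBCCD"
    code = huffman_code(data)
    encoded = "".join(code[s] for s in data)
    # 19 bits instead of the 20 bits a fixed 2-bit code for 4 symbols would need;
    # the gain grows as the symbol distribution becomes more skewed.
    print(code, len(encoded))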

Lossless compression is bound by a limitation that follows from information and coding theory: the average codeword length cannot be smaller than the entropy (expressed in bits) of the information source. So the closer a compression technique comes to this limit, the better the compression ratio that can be achieved, and no lossless compression method can go beyond this limit. The basic concepts of information theory are explained below.

Information is a term that actually has no precise mathematical definition in information theory. It should be understood in a colloquial way and treated as indefinable. Information should not be confused with data (data build information) or with a message (transmitted information). Although there is no definition, it is possible to measure information. The amount of information is calculated with the following equation:

I(u_i) = \log_{bs} \frac{1}{p_i}

where p_i is the probability that the symbol u_i will occur in the source of information.

This equation measures the information related to the occurrence of a single symbol in a probabilistic source of information. The unit of this information measure depends on the base bs of the logarithm: when bs = 2 the unit is the bit, when bs = 3 the unit is the trit, when bs = e (natural logarithm) the unit is the nat, and the last unit – the Hartley – is used when bs = 10.

Entropy is a different measure of information – it describes the amount of information specified by a stream of symbols. According to Shannon's definition, the entropy is the average amount of information I(u_i) over all symbols u_i that build the stream. So when the data U = {u_1, u_2, ..., u_U} constitute the information, then the entropy can be calculated from:

H(U) = \sum_{i=1}^{U} p(u_i) \cdot I(u_i) = \sum_{i=1}^{U} p(u_i) \cdot \log_{bs} \frac{1}{p(u_i)} = - \sum_{i=1}^{U} p(u_i) \cdot \log_{bs} p(u_i)

The above-mentioned formulas are correct only when the emission of a symbol by the source is independent of past symbols – i.e. when the source is a memoryless source. Other types of sources, e.g. sources with memory or finite-state machine sources, like a Markov source, require changes to these formulas.
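A minimal sketch of the entropy formula above, applied to an empirical symbol stream (assuming a memoryless source and bs = 2, so the result is in bits); the pixel values are made up for the example.

    # Illustrative sketch: empirical entropy of a symbol stream.
    import math
    from collections import Counter

    def entropy(symbols, base=2):
        """H(U) = -sum_i p(u_i) * log_base p(u_i), with p estimated from frequencies."""
        counts = Counter(symbols)
        total = len(symbols)
        return -sum((c / total) * math.log(c / total, base) for c in counts.values())

    # For an 8-bit grayscale image the entropy is at most 8 bits per pixel;
    # a lossless coder cannot push the average codeword length below this value.
    pixels = [12, 12, 12, 200, 200, 37]          # hypothetical pixel values
    print(entropy(pixels))                        # about 1.46 bits per symbol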

2.1.2. Lossy Compression

The limited effectiveness of lossless compression techniques brought about a demand for a different approach to compression, one that gives better compression ratios. Better effectiveness can be achieved only by giving up the reversible character of the encoding process. Lossy compression methods reduce the information of the image to be encoded down to some level that is acceptable for a particular application field. This means that, apart from the characteristics of a compression method known from lossless techniques – compression ratio and time needed for encoding and decoding – in lossy methods one more occurs: the distortion rate. By distortion rate, one should understand the distance between the original image and the image reconstructed in the decoding process.


Figure 2.1. General scheme for lossy compression.

In lossy compression algorithms, two obligatory phases can be distinguished: quantization and lossless compression. This means that quantization is the key issue for lossy methods. Before the quantization, one more phase can be found – decomposition, which is optional but very frequently used, because it allows one to create more effective quantization algorithms.

The goal of the decomposition is to build a representation of the original data that enables more effective quantization and encoding phases. The basic way to achieve this goal is to reduce the length of the representation compared to the original data. Although the decomposition phase is optional, it exists in every practical implementation of lossy compression. Before quantization proceeds, the decomposition reduces the redundancy and correlation of symbols (pixel values) in the stream to be encoded. Combining decomposition with simple quantization results in very good effectiveness with much lower complexity and encoding/decoding time.

There are many different ways to perform the decomposition; the most popular are:

frequency transforms,

wavelet transforms,

fractal transforms.

The quantization reduces the number of symbols of the alphabet that will be used by the intermediate representation of the stream to be encoded. This means that the information carried by the image is partially lost in this phase. Compression methods often allow adjusting the level of information loss – when the entropy is lower, the length of the encoded stream is smaller. Thus, the quantization is the most important phase in all practical realizations of lossy compression, because it determines the compression ratio, the quality of the recovered image and the amount of information lost during encoding.

The quantization in lossy compression techniques can be compared to the digitization of an analog image, where a set of continuous values is replaced with a representation using a limited number of discrete quantization levels. In compression, the image is already represented with a set of discrete values, but it is replaced with a smaller set that best preserves the information of the original image. Dequantization is the process opposite to quantization, in which the original stream is reconstructed based on the levels of quantization. Dequantization is inseparably bound with approximation of the values, because the exact reconstruction of all symbols encoded with lossy compression is impossible.

Two types of quantization are used in lossy compression methods – scalar quantization and vector quantization. The difference between them lies in the elementary unit of symbols being processed: in scalar quantization this unit is a single symbol, while in vector quantization it consists of a number of successive symbols – a vector of symbols. Both methods can employ regular or irregular interval lengths.
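The following sketch illustrates regular (uniform) scalar quantization and the corresponding dequantization; the step size of 16 is an arbitrary illustrative choice, not a value prescribed by any particular method.

```python
import numpy as np

def quantize(values, step):
    """Regular scalar quantization: map each value to the index of its interval."""
    return np.round(np.asarray(values, dtype=float) / step).astype(int)

def dequantize(indices, step):
    """Dequantization: reconstruct each value as a representative of its interval;
    the original values cannot be recovered exactly."""
    return indices * step

pixels = np.array([12, 13, 40, 200, 201])
indices = quantize(pixels, step=16)       # fewer distinct symbols than the input
restored = dequantize(indices, step=16)   # only an approximation of `pixels`
```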

Figure 2.2. Regular scalar quantization.

The quantization can be executed in an adaptive manner. The adaptation can be forward or backward. In forward adaptation, the input stream is divided into pieces with similar statistical characteristics, e.g. variance. A quantizer is built separately for each of these pieces. This method results in better quantization of the entire input stream at the cost of greater computational complexity and of the additional description of the quantizer that has to be attached to the encoded stream.

The backward method of adaptive quantization builds the quantizer based on the data already processed during the quantization process. This method does not require any additional information about the quantization to be attached to the encoded stream.

The last phase of lossy compression methods is de facto a complete lossless compression method to which the output of quantization is passed as the input stream to be encoded. A large variety of lossless methods is used in different lossy compression methods. Any type of lossless method can be used here, but it must be chosen with respect to the decomposition and quantization techniques.

Any phase of above-described scheme can be static or adaptive. Adaptive version usually leads to increased effectiveness with the cost of higher complexity of the algorithm.

As mentioned at the beginning of this section, the compression ratio in lossy techniques is not limited by the entropy of the original stream. The entropy of the encoded stream can be reduced if a higher compression ratio is required. Nevertheless, decreased entropy entails higher distortion. Very helpful here is rate-distortion theory, which answers the question: what is the minimal entropy of the encoded stream that is enough to reconstruct the original image without exceeding a given distortion level. This theory allows determining the boundaries of the compression ratio in lossy compression methods.

Figure 2.3. Compression system model in rate-distortion theory.

The notation used to explain rate-distortion theory is shown in figure 2.3; in the figure, the bit rate is marked with $R$. According to rate-distortion theory, the bit rate $BR$ (average bit length per symbol) is related to the distortion by the following dependency:

$$BR(D_{max}) = \min_{D(X, \tilde{X}) \le D_{max}} \left\{ I_m(X, \tilde{X}) \right\}$$

The $I_m$ in the above equation denotes the mutual information – the average information that the random variables (here $X$ and $\tilde{X}$) convey about each other:

$$I_m(X, \tilde{X}) = H(X) - H(X|\tilde{X}) = H(\tilde{X}) - H(\tilde{X}|X)$$
$$= \sum_{x_i} \sum_{\tilde{x}_i} f_{X,\tilde{X}}(x_i, \tilde{x}_i) \cdot \log \frac{f_{X,\tilde{X}}(x_i, \tilde{x}_i)}{f_X(x_i) \cdot f_{\tilde{X}}(\tilde{x}_i)} = \sum_{x_i} \sum_{\tilde{x}_i} f_X(x_i) \cdot f_{\tilde{X}|X}(\tilde{x}_i, x_i) \cdot \log \frac{f_{\tilde{X}|X}(\tilde{x}_i, x_i)}{f_{\tilde{X}}(\tilde{x}_i)}$$

The random variable $X$ describes the original data set and $\tilde{X}$ represents the reconstructed data set. The $f_X(x_i)$ is the occurrence probability of a given symbol. The $f_{\tilde{X}|X}(\tilde{x}_i, x_i)$ is the conditional probability that a given symbol will occur in source $\tilde{X}$ under the condition that some symbol occurs in source $X$. The values $f_X(x_i)$ are defined by the statistics of the information source, while the values $f_{\tilde{X}|X}(\tilde{x}_i, x_i)$ characterize the compression method.

The mutual information has the following properties:
$$0 \le I_m(X; \tilde{X}) = I_m(\tilde{X}; X), \qquad I_m(X; \tilde{X}) \le H(X), \qquad I_m(X; \tilde{X}) \le H(\tilde{X})$$

The distortion per symbol can be measured with the Hamming distance or another measure, e.g. $d(x_i, \tilde{x}_i) = (x_i - \tilde{x}_i)^2$ or $d(x_i, \tilde{x}_i) = |x_i - \tilde{x}_i|$. Independently of the chosen measure, the distortion $d$ has the following properties:
$$d(x_i, \tilde{x}_i) \ge 0, \qquad d(x_i, \tilde{x}_i) = 0 \ \text{ when } \ x_i = \tilde{x}_i$$


The value $D$ expresses the average distortion for an image and is given by the equation:
$$D(X, \tilde{X}) = E\left\{ d(X, \tilde{X}) \right\} = \sum_{x_i} \sum_{\tilde{x}_i} f_{X,\tilde{X}}(x_i, \tilde{x}_i) \cdot d(x_i, \tilde{x}_i)$$

The formulas presented above state that, under the criterion that the average distortion is not greater than the given value $D_{max}$, the minimal bit rate is equal to the greatest lower bound of the average mutual information. To find such a compression method, characterized by $f_{\tilde{X}|X}(\tilde{x}_i, x_i)$, one has to minimize the amount of information about the random variable $X$ carried by the random variable $\tilde{X}$ for a distortion level $D$ not greater than $D_{max}$.

The relationship between bit rate and distortion level is visualized on figure 2.4.

Figure 2.4. The relationship between bit rate and distortion in lossy compression.

There are many ways to measure the quality of the reconstructed image obtained with a given compression method. Probably the two most popular measures are the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR), defined by the following formulas:
$$MSE = \frac{1}{M \cdot N} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left[ X(m,n) - \tilde{X}(m,n) \right]^2$$
$$PSNR = 10 \cdot \log_{10} \left( \frac{\max_X^2}{MSE} \right)$$
where $X$ is the original image, $\tilde{X}$ the reconstructed image, $M$ the number of pixels in a row, $N$ the number of pixels in a column, and $\max_X = 2^{bit_d} - 1$ the maximal possible pixel value of the image $X$ ($bit_d$ – bit depth). PSNR is expressed in decibels (dB).
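A direct transcription of the two measures into Python (a minimal sketch; `bit_depth=8` is just a common default, and both images are assumed to have the same dimensions):

```python
import numpy as np

def mse(original, reconstructed):
    """Mean squared error between two images of equal size."""
    diff = original.astype(float) - reconstructed.astype(float)
    return np.mean(diff ** 2)

def psnr(original, reconstructed, bit_depth=8):
    """Peak signal-to-noise ratio in dB, with max_X = 2**bit_depth - 1."""
    err = mse(original, reconstructed)
    max_x = 2 ** bit_depth - 1
    return float("inf") if err == 0 else 10 * np.log10(max_x ** 2 / err)
```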

2.2. Fractal Compression

Fractal compression methods, which belong to the lossy methods, distinguish themselves from other techniques by a very innovative theory. To some extent, fractal compression diverges from the basic scheme of lossy compression methods described above.


Figure 2.5. Self-similarity in real images: (a) the picture of Lenna; (b) self-similarity in Lenna's picture (from einstein.informatik.uni-oldenburg.de/rechnernetze/fraktal.htm, 19.01.2008).

The most important part of this theory is that parts of an image are approximated by different parts of the same image (the image is self-similar). This assumption makes it possible to treat the image as a fractal.

According to B. Mandelbrot [Man83], the “father of fractals”, a fractal is

A rough or fragmented geometric shape that can be subdivided in parts, each of which is (at least approximately) a reduced-size copy of the whole.

A fractal is a geometric figure with infinite resolution and some characteristic features. The first of them is the already mentioned self-similarity. Another one is the fact that fractals are described with a simple recursive definition and, at the same time, cannot be described in the language of traditional Euclidean geometry – they are too complex. As a consequence of their self-similarity, fractals are scale independent – a change of size causes generation of new details. Fractals have plenty of other very interesting properties; nevertheless, they are not necessary to understand fractal compression theory and will not be explained here.

The essence of fractal compression is to find a recursive description of a fractal that is very similar to the image to be compressed. The distance between the image generated from this description and the original image shows how large the information loss is. Although fractal compression is based on the assumption that the image can be treated as a fractal, there is some divergence from the fragments of fractal theory presented above. In fractal compression, the self-similarity of the image is loosened – it is assumed that parts of the image are similar to other parts, not to the whole image.

All other properties of fractals remain valid for an image encoded with a fractal compression method. The image can be generated at any size, smaller or larger than the original. The quality of the reconstructed image will be the same at all sizes, and edges will always have the same sharpness. The number of details can be adjusted by changing the number of iterations applied to the recursive description of the image.


The fractal theory says that the recursive description of a complex shape should be simple. Any photographic-like image is very complex, so if such an image can be described as a fractal then a great compression ratio should be achievable.

The fractal description of an image consists of a system of affine transformations. This system is called the fractal operator and has to be convergent.

2.2.1. Decoding

Iterated Function System

Fractal compression is based on IFS (Iterated Function Systems) – one of many ways to draw fractals. The IFS uses contractive affine transformations.

By a transformation, one should understand an operation that changes the position of points belonging to the image. If the space of digital images is denoted by $F$ and a metric by $d$, then the pair $(F, d)$ constitutes a complete metric space. Nonempty compact subsets of $F$ are the points of the space. In this space, a transformation means a function $w: F \to F$.

A transformation $w$ is contractive when it satisfies the Lipschitz condition, i.e. for any $x, y \in F$ there is a real number $0 < \lambda < 1$ such that $d(w(x), w(y)) \le \lambda \, d(x, y)$, where $d(x, y)$ denotes the distance between the points $x$ and $y$.

A transformation is affine when it preserves certain properties of the geometric objects exposed to it. The constraints which make a transformation affine are:

preservation of collinearity – lines are transformed into lines; the images of points (three or more) that lie on a line are also collinear,

preservation of the ratios of distances between collinear points – if points $p_1, p_2, p_3$ are collinear then
$$\frac{d(p_2, p_1)}{d(w(p_2), w(p_1))} = \frac{d(p_3, p_2)}{d(w(p_3), w(p_2))}$$

Affine transformations are combinations of three basic transformations:

shear (enables rotation and reflection)

translation (movement of a shape)

scaling/dilation (changing the size of a shape)

A single transformation may be described with the following equation:
$$w_i \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a_i & b_i \\ c_i & d_i \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix}$$
The coefficients $a, d$ determine the dilation, the coefficients $b, c$ determine the shear, and $e$ and $f$ specify the translation. The variables $x, y$ are the coordinates of the point (pixel) that is currently being transformed.

To generate a fractal, several transformations are needed. These transformations form the fractal operator $W$, often called Hutchinson's operator:
$$W = \bigcup_{i=1}^{W} w_i$$

An Iterated Function System is defined by a complete metric space $(F, d)$ and an operator $W$. The Banach fixed point theorem guarantees that, in the complete metric space $F$, the operator $W$ has a fixed point $A$, called the attractor: $W(A) = A$, which can be reached from any starting image through iterations of $W$. The images produced in the iterations are successive approximations of the attractor.

Figure 2.6. Generation of the Sierpinski triangle. The first four iterations and the attractor.

$$A = \lim_{i \to \infty} W^{\circ i}(X_0), \quad \text{where } W^{\circ i} = W \circ W^{\circ (i-1)} \text{ and } X_0 \in F$$

Thus the fixed point of a fractal described with an IFS can be found through a recursive algorithm. An arbitrary image $X_0$ ($X_0 \in F$) is put on the input and processed with a given number of iterations. In each iteration, the whole output image from the previous iteration is subjected to all transformations in the operator (Deterministic IFS):
$$X_r = W(X_{r-1}) = \bigcup_{i=1}^{W} w_i(X_{r-1})$$
where $X_r$ is the image produced in iteration $r$, or the initial image when $r = 0$.

There is a second version of this algorithm in which a starting point $x_0$ ($x_0 \in X_0$) is picked at the beginning. In each iteration, a randomly chosen transformation is applied to the point from the previous iteration (Random IFS).

In figure 2.6, several iterations of the deterministic IFS are shown. In each picture, the dashed square contains the image that will be found on the input in the next iteration. The squares with solid lines represent the transformations – the image from the previous iteration is rescaled and moved to fit each square. The Sierpiński triangle is described with only three transformations:

"

"

x y

"

x

0 y

0 x

0 y

0

0

0

#

#

=

"

0

.

5 0

0 0

.

5

# "

x y

#

+

"

0

.

25

0

.

5

#

#

=

"

0

.

5 0

0 0

.

5

# "

=

"

0

.

5 0

0 0

.

5

# "

x y

#

+

"

x y

#

+

"

0

0

#

0

.

5

0

#

The first transformation is related with the bottom left square, the second with the top square, and the last with the bottom right one.
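For illustration, the three transformations above can be iterated with the Random IFS described earlier (the so-called chaos game). The sketch below is a minimal Python rendering of that idea; the number of points and the discarded warm-up iterations are arbitrary choices.

```python
import random

# The three Sierpinski transformations: scale by 0.5, then translate.
MAPS = [
    lambda x, y: (0.5 * x,        0.5 * y),        # bottom-left square
    lambda x, y: (0.5 * x + 0.25, 0.5 * y + 0.5),  # top square
    lambda x, y: (0.5 * x + 0.5,  0.5 * y),        # bottom-right square
]

def random_ifs(n_points=50_000, warm_up=20):
    """Random IFS (chaos game): repeatedly apply a randomly chosen map to the
    previous point; the generated points accumulate on the attractor."""
    x, y = random.random(), random.random()   # arbitrary starting point
    points = []
    for i in range(n_points):
        x, y = random.choice(MAPS)(x, y)
        if i >= warm_up:   # discard the first points, not yet near the attractor
            points.append((x, y))
    return points
```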

Iterated Function Systems allow constructing very interesting fractals, e.g. the Koch curve, the Heighway dragon curve, the Cantor set, the Sierpinski triangle, and the Menger sponge. Some fractals which can be drawn with an IFS imitate nature quite well, e.g. the Barnsley fern. The Barnsley fern (see figure 2.7) is described by an operator with four transformations:

"

x

0 y

0

"

x

0 y

0

#

=

"

0

.

85 0

.

04

0

.

04 0

.

85

# "

x y

#

+

"

#

=

"

0

.

15 0

.

28

0

.

26 0

.

24

# "

x y

#

+

"

0

1

.

6

#

0

0

.

44

#

"

x y

"

x

0 y

0

0

0

#

=

"

0

.

2

0

.

26

0

.

23 0

.

22

# "

#

=

"

0 0

0 0

.

16

# "

x y

#

x y

#

+

"

+

"

0

0

#

0

1

.

6

#

Figure 2.7. Barnsley fern


Partitioned Iterated Function System

Fractal compression uses PIFS (Partitioned Iterated Function System), which is a modified version of IFS. In IFS, the operator was specified by the number of affine transformations and the set of coefficients in Hutchinson's operator. In PIFS, the operator includes two additional coefficients for each transformation that determine the contrast and brightness of the images generated by the transformations. The most important difference between IFS and PIFS is that in IFS all transformations take the whole image from the previous iteration on input, while in PIFS it is possible to specify what part of the image should be processed – transformations can take different parts of the image on input. These two additional features give enough power to decode grayscale images from a description consisting only of the fractal operator.

The fragment of the space that is put on the input of a transformation is called a domain. Each transformation in PIFS has its own domain $D_i$ and transforms it into a range $R_i$.

The equivalent in PIFS of Hutchinson's matrix form in IFS is the following system:
$$w_i \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a_i & b_i & 0 \\ c_i & d_i & 0 \\ 0 & 0 & s_i \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \\ o_i \end{bmatrix}$$

In PIFS, the $z$ variable is the brightness function for a given domain (for each pair $x, y$ there is exactly one value of brightness): $z = f(x, y)$. Two new coefficients are introduced in PIFS to operate on the $z$ variable: $s_i$ specifies the contrast and $o_i$ the brightness.
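To make the role of $s_i$ and $o_i$ concrete, the sketch below performs one decoding iteration of a simplified PIFS with uniform 8 × 8 ranges and 16 × 16 domains; the symmetry operations are omitted and the tuple layout of `transforms` is an illustrative convention, not the thesis's notation.

```python
import numpy as np

def downsample2(block):
    """Average each 2x2 group of pixels (contract a domain block to range size)."""
    return 0.25 * (block[0::2, 0::2] + block[1::2, 0::2]
                   + block[0::2, 1::2] + block[1::2, 1::2])

def pifs_iteration(image, transforms, b=8):
    """One decoding iteration: each transform writes s * (contracted domain) + o
    into its range block. `transforms` holds tuples
    (range_row, range_col, dom_row, dom_col, s, o) in pixel coordinates."""
    out = np.empty_like(image, dtype=float)
    for rr, rc, dr, dc, s, o in transforms:
        domain = image[dr:dr + 2 * b, dc:dc + 2 * b]
        out[rr:rr + b, rc:rc + b] = s * downsample2(domain) + o
    return out

# Decoding starts from an arbitrary image and iterates towards the attractor:
# img = np.zeros((256, 256))
# for _ in range(10):
#     img = pifs_iteration(img, transforms)
```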

2.2.2. Encoding

As already mentioned, the fractal code of an image contains a fractal operator. PIFS solves the problem of decompressing an image, but compression is related to the inverse problem – the problem of finding an operator for a given attractor.

The first solution to the inverse problem was developed by Michael F. Barnsley. The basis of his method is the collage theorem. The inverse problem is solved here approximately – the theorem states that one should concentrate on finding an operator $W$ that generates an attractor $A$ close to the given image $X$ (i.e. the image to be encoded):

$$X \approx A = W(A) = w_1(A) \cup w_2(A) \cup \ldots \cup w_W(A)$$
where $X$ is the image to be encoded, $W$ is the operator and $A$ the attractor of $W$.

Thus the goal is to find a fractal operator consisting of transformations $w_i$ that represents an approximation of the given image. The theorem gives more specific information about the distance between the original image and the attractor generated from the found IFS:

$$\delta(X, A) \le \frac{\delta(W(X), X)}{1 - s}$$
where $\delta$ is the distance measure, $s$ is the contractivity factor of $W$, and $0 < s < 1$.

According to this equation, the closer the collage $W(X)$ (the first-order approximation of the fixed point) is to the original image $X$, the better the found IFS is – the attractor $A$ is closer to the original image $X$. So during the encoding process one can focus on minimizing the distance $\delta(W(X), X)$, and this will result in minimizing the distortion $\delta(X, A)$, which is the goal of fractal compression. The quantitative distance measure $\delta(W(X), X)$ is called the collage error. The computational complexity of fractal compression is significantly reduced by minimizing the collage error instead of the distance between the original image and the attractor. However, this solution does not give optimal results.

The distance $\delta(X, A)$ between the original image and the attractor is also influenced by the contractivity factor – if $s$ is smaller, then the images are closer to each other. However, minimizing $s$ also has other effects: the smaller $s$ is, the larger the fractal operator becomes – more transformations are needed to encode the image.

Thus, one has to find all ranges and domains and to specify the transformations. The distances between all ranges $R_i \in R$ and the corresponding domain blocks $D_i$ give the collage error $\delta(W(X), X)$, so they determine the accuracy of the compression. The amount of information lost during encoding can therefore be reduced by pairing closer ranges and domains into transformations. The process of finding a proper range and domain is computationally very complex, so the computing time is long, and improving the quality of encoding extends the process even more.

The first fully automatic method for fractal compression was presented by Jacquin [Jac93]. The key problem is to find a set of non-overlapping ranges $R_i$ that covers the whole image to be encoded; each range must be related to a domain. The distance between $R_i$ and the corresponding $D_i$ has to be minimal – there should be no other domain that is closer to $R_i$. A draft of the encoding algorithm may look like this:

1. divide the image into overlapping domains D = {D_1, D_2, ..., D_m} and disjoint ranges R = {R_1, R_2, ..., R_n}
   // the size of each range is b × b and the size of each domain is 2b × 2b
2. for each range R_i ∈ R:
   2.a set w_i := NULL, D_i := NULL, j := 1
   2.b for each domain D_j ∈ D:
      2.b.i compare R_i with 8 transformations of D_j
         // transformations: rotation of D_j by 0, 90, 180, 270 degrees and rotation of the reflection of D_j by 0, 90, 180, 270 degrees
      2.b.ii determine the parameters of the transformation w_i^j that gives the minimal distance between w_i^j(D_j) and R_i
      2.b.iii calculate δ(w_i^j(D_j), R_i) – the distance between w_i^j(D_j) and R_i
      2.b.iv if δ(w_i(D_i), R_i) > δ(w_i^j(D_j), R_i) or w_i = NULL
         // i.e. if for all 0 < k < j: δ(w_i^k(D_k), R_i) > δ(w_i^j(D_j), R_i)
         then w_i := w_i^j, D_i := D_j
   2.c add w_i to the fractal code
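A compact Python sketch of the inner search (steps 2.b.ii–2.b.iv): for one range block it tries every codebook block (contracted domain), fits the intensity parameters by least squares and keeps the best match. The eight isometries are omitted for brevity, and the function names are illustrative.

```python
import numpy as np

def fit_s_o(codebook_block, range_block):
    """Least-squares fit of contrast s and brightness o so that
    s * codebook_block + o approximates range_block."""
    c = codebook_block.ravel()
    r = range_block.ravel()
    A = np.column_stack([c, np.ones_like(c)])
    (s, o), *_ = np.linalg.lstsq(A, r, rcond=None)
    return s, o

def best_transform(range_block, codebook):
    """Exhaustive search: return (index, s, o, error) of the codebook block
    that approximates the range block best."""
    best = None
    for j, cb in enumerate(codebook):
        s, o = fit_s_o(cb, range_block)
        err = float(np.sum((s * cb + o - range_block) ** 2))
        if best is None or err < best[3]:
            best = (j, s, o, err)
    return best
```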

The above-presented algorithm became a basis for many fractal compression methods; all other methods can be treated as improvements to Jacquin's method. The result of fractal encoding is the fractal code, which consists only of the parameters of the fractal operator's transformations.

2.3. Fractal Magnification

Figure 2.8. Fractal magnifier block diagram

The fractal magnification (or resolution improvement) is simply the process of encoding and decoding an image with partitioned iterated functions systems. The transformations that build the fractal code describe relations between different parts of the image and no information about the size or resolution of the image is being stored.

Thus, the fractal code is independent of the resolution of the original image. At the same time, the fractal operator stored within the fractal code leads to an attractor that is only an approximation of the original image but has continuous tone. This means that the image can be decoded to any resolution – higher or lower than the original. Resolution improvement is here equivalent to fractal magnification.

A display device has a fixed size of the pixels. Higher resolution means that the image can be displayed on a higher number of pixels, thus the physical dimension of the displayed image is higher than the original’s.

Figure 2.9. Fractal magnification process

When an image is magnified, the new details are generated during decompression, so there is no problem with the values of pixels that do not exist in the original image. Image interpolation, the most popular technique for zooming images, has to calculate the values of pixels inserted between the original image's pixels. There are different interpolation methods, e.g. nearest neighbor, linear or bicubic interpolation. These classical methods of image enlargement are inseparably bound with some image distortions. For example, image pixelization (large pixels) may appear, i.e. the borders between groups of pixels that represent one pixel of the original image become clearly visible.

Chapter 3

Fractal Compression Methods

All fractal compression methods originate from the same ancestor, which was briefly described in the previous chapter. Because of this, there is an appreciable number of similarities between the methods. Thus, different fractal methods will not be described from beginning to end, since much of the content would be repeated several times. Instead, the differences between the compression methods will be discussed. One has to keep in mind that many different fractal methods, elaborated by different authors, may implement some element in the same manner. The elements of the fractal compression algorithm that vary among different methods are grouped into several categories; each section in this chapter corresponds to one such category.

3.1. Partitioning Methods

The partitioning scheme used to demarcate the range blocks is one of the most crucial elements of a fractal compression method. The fidelity and quality of the reconstructed image, the length and structure of the fractal code, the shape of the transformations used to map domains into ranges and their descriptions in the fractal code, the compression ratio, the encoding time and all other important characteristics of the compression method are somehow influenced by the choice of the partitioning method.

For example, only when uniform partitioning is used is there no need to attach any information about the partition to the fractal code – only the transformation coefficients are stored. At the same time, there are partitioning methods that consume even about 44% of the fractal code to describe the partition [Har00]. Of course, there are plenty of methods between these two cases – e.g. quadtree partitioning takes about 3.5% of the total code size to define the partition [Har00]. Surprisingly, this "additional" information does not have a negative effect on the rate-distortion performance – of the three methods mentioned, the best results are given by the one that needs the largest space to specify the partition, and the weakest is uniform partitioning. This is because the partitioning scheme also has an impact on the number of transformations – Hartenstein's method produces only a few, but large, range regions, which cannot be achieved with the two remaining methods.


The partitioning methods, including the three mentioned above, are described in more detail in the following subsections. Notes on their performance can also be found there.

3.1.1. Uniform Partitioning

The uniform partitioning method is presented in section 2.2.2, but it is not the only option in fractal compression – actually, it is the most basic solution. Uniform partitioning is image-independent because the ranges and domains have a fixed size; usually the size of a range is 8 × 8 (which means that the size of each domain is 16 × 16).

This partitioning method has some serious drawbacks. Firstly, details smaller than the size of a range may be present in the image. Such details will be lost during encoding because it would be hard to find a domain with exactly the same details. Of course, a domain will be found for each range, but another problem occurs here – there is no certainty that the distance between these two squares will be really small. The size of the ranges can be adjusted to minimize the problem of matching ranges and domains. However, the smaller the ranges are, the worse the compression ratio is, because transformations have to be found for a larger number of ranges. At the same time, some parts of the image could be covered with larger ranges while keeping the loss of information at an acceptable level. This would result in a lower number of transformations, and thus a better compression ratio.

3.1.2. Overlapped Range Blocks

This method, which is a modification of partitioning into squares, was created by Polidori and Dugelay [Pol01]. The method is very similar to uniform partitioning – all ranges have the same size b × b and domains 2b × 2b. The difference is that the ranges are not disjoint but mutually overlapping by half of their size. This means that all pixels belong to more than one range – pixels close to the edge of the image belong to two ranges and the rest of the pixels are within four ranges. The partitions are encoded independently and decoding gives up to four different values for each pixel; from these four approximations, the final pixel value is calculated.

This method gives much better results than pure "squares", e.g. it effectively reduces the block effect. However, it also has shortcomings. It is much more time consuming – the image is effectively encoded four times and decoded four times during each encoding-decoding process. In addition, the fractal code representing the image is almost four times longer. At the same time, the risk of losing small details is not eliminated.

3.1.3. Hierarchical Approaches

Hierarchical approaches to image partitioning constitute the first class of image-adaptive techniques. The decomposition of the image depends on its content – parts of the image with condensed details are divided into smaller ranges and flat regions into large ones. This feature makes it possible to overcome the limitations of the fixed-size (uniform) partitioning scheme. There are two types of hierarchical approaches: top-down and bottom-up.


In the top-down approaches, the whole image is treated as a single range at the beginning of the encoding (or it is divided into large uniform partitions). If it is not possible to find a domain that is close enough (error criterion) to a range then the range is being split into several ranges (the number depends on the method).

The bottom-up approaches start with an image divided into small ranges that assure a low level of information loss. In a later phase of partitioning, neighboring ranges that are close enough to each other are merged; thanks to that, the final ranges can have different sizes.

Quadtree Partitioning

The quadtree partitioning presented by Yuval Fisher [Fis92c] was the first hierarchical approach to partitioning. All ranges have the shape of a square. In this method, the set D of domains contains all square ranges with side sizes 8, 12, 16, 24, 32, 48 and 64. Domains situated slantwise can also be admitted in order to improve the quality of the encoded image. In the top-down algorithm, the whole image is divided into fixed-size (32 × 32 pixels) ranges at the beginning. Then, for each range, the algorithm tries to find a domain (larger than the range) that gives a collage error smaller than some preliminarily set threshold. If this attempt ends in failure for some ranges, then each such range is divided into four. For all newly created ranges the procedure is repeated, i.e. fitting domains are searched for the ranges and, if necessary, the non-covered ranges are broken down further. The encoding ends when no ranges remain uncovered or the size of the ranges reaches a given threshold. In the second case, the smallest ranges are paired with the domains that do not meet the collage error requirement but are closest to the corresponding ranges.

Figure 3.1. Quadtree partitioning of a range. Four iterations.
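The top-down splitting logic just described can be sketched as a short recursion; `covers` stands in for the domain search with the collage-error threshold and is a hypothetical callback, as are the parameter names.

```python
def quadtree_partition(x, y, size, covers, min_size=4):
    """Recursively split a square range at (x, y) with side `size` into four
    quadrants until covers(x, y, size) reports that a suitable domain exists
    or the minimal range size is reached. Returns the accepted ranges."""
    if size <= min_size or covers(x, y, size):
        return [(x, y, size)]
    half = size // 2
    ranges = []
    for dx in (0, half):
        for dy in (0, half):
            ranges += quadtree_partition(x + dx, y + dy, half, covers, min_size)
    return ranges

# Example with a stand-in criterion (accept a block only if it is nearly flat):
# ranges = quadtree_partition(0, 0, 32,
#                             covers=lambda x, y, s: image[y:y+s, x:x+s].var() < 10.0)
```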

Besides allowing the slantwise domains, there are also other improvements to the method. The adaptivity of the division can be increased by introduction of unions of quadrants created during division of a range.

The main drawback of the quadtree partitioning method is that all ranges are divided in the same way, independently of their content. The size of the ranges and the structure of the partitioning are adaptive to the whole image, but the act of breaking down a single range always produces the same output – the four quadrants of the input range. The partitioning would be better fitted to the content of the image if the partitioning process were adaptive also at the stage of drawing the borders of future regions during range block division. Theoretically, this improvement would result in larger range blocks, i.e. in a lower number of transformations.


Horizontal-Vertical Partitioning

In the horizontal-vertical (HV) partitioning method [Fis95c], the shape of a range can be not only a square but any rectangle, because a range (when there is no domain close enough to it) is divided into two rectangles instead of four squares. The frontier between the two rectangles is established at the most significant horizontal or vertical edge. Thus, this method is an answer to the disadvantages of quadtree partitioning – it tries to find the best division of a range into two new ranges by a horizontal or vertical cut. The image is partitioned in this manner from the beginning, i.e. there is no initial phase in which the image is divided into uniform partitions (as in quadtree partitioning). The algorithm also includes mechanisms preventing the degeneration of rectangles.

The algorithm uses two formulas ($v_m$ and $h_n$) that allow determining the direction and the position of the cut:
$$v_m = \frac{\min(m,\; width(R_i) - 1 - m)}{width(R_i)} \cdot \left( \sum_{n=0}^{height(R_i)-1} r_{m,n} - \sum_{n=0}^{height(R_i)-1} r_{m+1,n} \right)$$
$$h_n = \frac{\min(n,\; height(R_i) - 1 - n)}{height(R_i)} \cdot \left( \sum_{m=0}^{width(R_i)-1} r_{m,n} - \sum_{m=0}^{width(R_i)-1} r_{m,n+1} \right)$$
where $width(R_i) \times height(R_i)$ is the dimension of the range block $R_i$, $1 \le m < width(R_i)$ and $1 \le n < height(R_i)$.

The second factors of these formulas, $\left( \sum_n r_{m,n} - \sum_n r_{m+1,n} \right)$ and $\left( \sum_m r_{m,n} - \sum_m r_{m,n+1} \right)$, give the difference of pixel intensity between adjacent columns ($m$ and $m+1$) and rows ($n$ and $n+1$). The maximal values of these differences point out the most distinctive horizontal and vertical lines.

The first factors, $\min(m, width(R_i) - 1 - m)/width(R_i)$ and $\min(n, height(R_i) - 1 - n)/height(R_i)$, ensure that the rectangles created by splitting the range block will not be too narrow – the closer a possible cutting line location is to the middle of the range block, the more privileged it is.

At this point, there are two alternative lines along which the split can be done – one vertical and one horizontal. HV partitioning allows cutting along only one of them:

if $\max(h_0, h_1, \ldots, h_{height(R_i)-1}) \ge \max(v_0, v_1, \ldots, v_{width(R_i)-1})$ then the range block is partitioned horizontally,

otherwise, the range block is partitioned vertically.

In other words, the more distinctive cutting line is chosen from the two alternatives.
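A minimal sketch of that decision for one range block `r` (a 2-D array with `r[row, col]`): it evaluates the biased column and row differences from the formulas above and compares their magnitudes. Comparing absolute values is one reasonable reading of "maximal values of these differences", and the exact index ranges are simplified.

```python
import numpy as np

def hv_split(r):
    """Choose the HV cut for a range block r. Returns ('h', n) for a horizontal
    cut below row n, or ('v', m) for a vertical cut after column m."""
    height, width = r.shape
    col_sums = r.sum(axis=0)           # sum of each column m
    row_sums = r.sum(axis=1)           # sum of each row n
    m = np.arange(width - 1)
    n = np.arange(height - 1)
    v = np.minimum(m, width - 1 - m) / width * (col_sums[:-1] - col_sums[1:])
    h = np.minimum(n, height - 1 - n) / height * (row_sums[:-1] - row_sums[1:])
    if np.max(np.abs(h)) >= np.max(np.abs(v)):
        return 'h', int(np.argmax(np.abs(h)))
    return 'v', int(np.argmax(np.abs(v)))
```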

The increased adaptivity is paid for dearly with increased time complexity (due to the variety of range shapes and additional computations) and a longer description of the partitions. However, these additional costs pay off – the rate-distortion performance is significantly improved compared to the quadtree partitioning method. This superiority is caused by better adaptivity and larger range block sizes (i.e. a lower number of range blocks).

Triangular Partitioning

Figure 3.2. Horizontal-vertical partitioning of a range. Four iterations.

The next partitioning method [Fis92a] is based on triangles. In the first step, the rectangular image is divided into two triangles along one of its diagonals. At this point, the recursive algorithm begins. Each triangle for which no suitable domain can be found is divided into four triangular ranges. The borders between these triangles are drawn between three points that lie on three different sides of the range to be divided. The points that define the borders can be freely chosen in order to optimize the division and minimize the depth of the tree representing the partitioning, i.e. the number of transformations.

A second triangular partitioning scheme was also elaborated, in which the triangular range is divided along a line from a vertex of the triangle to a point on the opposite side [Nov93].

The triangular partitioning has several advantages over HV partitioning. The first is that distortions caused by imperfect matching of ranges and domains are less noticeable. The second, very significant, advantage is that rotation angles other than multiples of the right angle can occur within the transformations. This is because triangular ranges can have any orientation, whereas rectangular ranges (HV, quadtree, fixed-size partitioning) can lie only horizontally or vertically. The largest advantage of triangular partitioning is the reduction of the block effect, which can be observed in uniform partitioning.

Nevertheless, this partitioning scheme also has some heavy drawbacks. The comparison of a domain block with a range block is hampered by the difficulties with interpolation of the domain block when the pixels from the two blocks cannot be mapped one-to-one. This problem occurs in all partitioning schemes that are not based on right-angled blocks and is the reason why the right-angled methods are superior [Woh99].

Polygonal Partitioning

The polygonal partitioning is very similar to horizontal-vertical partitioning but is more adaptive to the image. It was invented by Xiaolin Wu and Chengfu Yao [Wu91], but Reusens was the one who applied it to fractal image compression [Reu94]. In this method, a range can be divided horizontally, vertically (as in HV) or along a line inclined by 45 or 135 degrees. Another method to obtain polygonal blocks is the modified Delaunay triangulation method – in the merging phase of this method, not only triangles but also quadrilaterals can be created [Dav95]. However, this method belongs to the second group of partitioning schemes – the split-and-merge approaches.


3.1.4. Split-and-Merge Approaches

The hierarchical approaches perform the partitioning while the pairs of ranges and domains are being found. The split-and-merge approaches divide the image into partitions before the search for transformations is started. The partitioning process consists here of two phases. The first phase – splitting – yields a fine uniform partitioning or a partitioning with varying density of ranges for different parts of the image. The second phase – merging – combines neighboring ranges with similar mean gray levels.

Delaunay Triangulation

Delaunay triangulation was adapted to fractal coding by Davoine and Chassery [Dav94, Dav96]. In this method, the partitioning results in a set of non-overlapping triangles that cover the whole image. The splitting phase starts by dividing the image into regular, fixed-size triangles. This triangulation is represented by regularly distributed points, which are the triangles' vertices. Then the triangles are investigated, and if any triangle is not homogeneous in the sense of a variance or gradient criterion, a point is added at the barycenter of the triangle. The splitting is recursively repeated until all triangles are homogeneous or the non-homogeneous triangles are smaller than a given threshold. Before each iteration, the triangulation must be recalculated based on the set of points.

The merging removes certain vertices and by this action, the triangles are combined. A vertex is removed if all triangles to which it belongs have similar mean gray levels. Each single change of the set of vertices entails the necessity of recomputing the triangulation before following actions are performed.

The Delaunay triangulation has the same main advantages as the triangular hierarchical partitioning – related with unconstrained orientation of triangles. However, the number of transformations determined with Delaunay triangulation is lower than in hierarchical approaches.

The triangles can be merged not only into larger triangles but also into quadrilaterals [Dav95]. This increases the compression ratio because the number of transformations is smaller in such a case. When the basic Delaunay partitioning and the enhanced scheme result in a similar compression ratio, the quality of the reconstructed image is better in the quadrilateral approach.

Irregular Regions

The methods that produce irregularly shaped range regions realize the splitting simply by utilizing the existing simple partitioning methods. Uniform partitions were employed in the first algorithm based on irregular regions, created by Thomas and Deravi [Tho95], but also in the work of other researchers [Sau96d, Tan96, Ruh97]. Quadtree partitioning was introduced to irregular partitioning by Chang [Cha97, Cha00]; Ochotta and Saupe also used this scheme [Och04].

The small squares from first phase are merged to form larger squares or irregular range blocks. This partitioning scheme adapts very well to the content of the image, which is being encoded.

However, there are problems with a concise description of the regions' boundaries. There are two main approaches to this issue: chain codes and region edge maps.

Figure 3.3. Region edge maps and context modeling: (a) cells with attached symbols; (b) context for the symbol X; (c) region edge map example.

Chain coding describes the path that follows the boundaries. To specify this path, a starting point and a sequence of symbols representing steps (simple actions: go straight, turn left, turn right) must be stored in the fractal code. The length of a step is equal to the length of the side of a region block in uniform partitioning, and to the length of the side of the smallest region block in the quadtree. The most basic version of chain coding encodes each closed region boundary into one path with a specified starting position. The performance of such an approach leaves much to be desired because redundant information is present, since almost all of the boundaries are shared by two regions.

The region edge map [Tat92] utilizes a grid of squares. If uniform partitioning was used in the splitting phase, then the grid is equal to these partitions. If quadtree partitioning was used, then the cells in the grid have the same size as the smallest ranges – any range (of the quadtree partitioning) is either a union of cells or a single cell. Each cell is provided with one of four symbols that indicate whether (and where) there is a range boundary at the edge of the cell; the symbol is stored in two bits. Only four instances are considered:

1. no range boundary

2. boundary on the North edge

3. boundary on the West edge

4. boundary on the North and on the West edges

The region edge maps can be efficiently encoded with adaptive arithmetic coding and context modeling. The context is built of the four cells processed (encoded or decoded) before the current cell – the neighbors in the West, North-West, North and North-East directions. There can be 256 different combinations of symbols in the context; some of these combinations indicate which symbols cannot occur in the currently processed cell. For example, when symbol 1 or 2 is attached to the cell to the North and the symbol in the cell to the West is 1 or 3, then the current symbol cannot be 2 or 3. This fact allows shortening the fractal code.


Figure 3.4. Performance of compression methods with irregular partitions: (a) chain code [Sau96d]; (b) region edge maps [Har00]; (c) quadtree-based region edge maps [Och04].

The irregular partitions guarantee good encoding results. Such partitioning schemes are highly adaptive to the image content and, since they are right-angled, they are devoid of the drawbacks of triangular partitioning. The experiments (see figure 3.4) show that they outperform any other partitioning method. However, there is still disagreement about which method is superior, which will be explained in the last section of this chapter.

3.2. Domain Pools and Virtual Codebooks

The two terms – the domain pool and the codebook – are very closely connected with each other. In the literature, they are often used interchangeably, but here, by a domain pool in the context of fractal coding, the author means a set of domains (a subset of all possible domains in the image) that is used during the search for a matching domain for a range. The codebook blocks correspond to domain blocks, but their size is the same as the size of the range. The set of all codebook blocks is called the virtual codebook. The codebook in fractal compression is virtual because it is not needed at decoding (it is not stored in the fractal code) – it is used only during the encoding phase. Summarizing, the codebook denotes a set of codebook blocks, which are contracted (downfiltered) domain blocks from the domain search pool.

The length and contents of the domain pool (codebook) are crucial for the efficiency of the encoding process. If the domain pool is larger, then more bits are required to represent the selected domain in the code. At the same time, a larger domain pool entails a longer time for searching a domain for a range, which results in a much longer encoding time. However, a larger domain pool also has a positive effect – it helps to achieve higher fidelity.

There are two main approaches to domain search that can be observed in different encoding methods. The first one, called global codebook search, provides the same domain pool (codebook) for all ranges of the image, although there may be different domain pools and codebooks for different classes of ranges. Local domain pool search, the second approach, makes the codebook dependent on the position of the range.


3.2.1. Global Codebooks


This solution to domain search is based on the assumption that a range and a domain can be paired into a transformation even if they lie in completely different parts of the image. This assumption is confirmed by [Fis92c, Fis95a, Fri94], where the authors state that no neighborhood of a range can be determined within which the best domain for the range lies.

An example of a global codebook can be seen in section 2.2.2. In the example fractal encoding algorithm, each domain block of the image is considered during the search for matching domains and ranges. Because the algorithm employs uniform partitions, the domain pool consists of blocks of the same size. The interval between corresponding borders of neighboring domain blocks is equal to one pixel vertically or horizontally. This solution is computationally very complex due to the large number of blocks within the codebook. The time cost is very high, but this procedure gives an optimal loss of information because the best match between a range and a domain will always be found.

In order to reduce the time cost, larger intervals between the blocks appended to the domain pool are introduced. The literature gives two typical interval values: equal to the domain-block width or to half of the domain-block width. This simple move significantly decreases the number of domains in the pool and, thanks to that, speeds up the search for a domain. The main rule is that the larger the domain pool is, the better the achieved fidelity, but at a higher time cost. So reducing the size of the domain pool gives a shorter search time (and shorter encoding time), but more information is lost (the errors between paired ranges and domains might be larger).

Larger intervals between the domains in the pool also result in better convergence at the decoder (fewer iterations are required to decode the image).

The global codebook constructed like above can be used when the image is segmented into uniform range blocks or with quadtree partitioning. It can be also used with HV partitioning, but a domain pool containing ranges (larger than currently processed range) or blocks created by the partitioning mechanism (used also for determining range-blocks) are more often used. These two last methods of constructing global domain pools can be also used with other adaptive partitioning schemes.

In the quadtree scheme there is not one global domain pool but several – a separate domain pool (and codebook) is provided for each class of ranges (all ranges within one class have the same size), containing domains twice as large as the ranges within the class.

3.2.2. Local Codebooks

The probability density of the spatial distances between ranges and matching domains has a distinct peak at zero. This means that it is much more likely to pair a range with a domain that is close to the range than with a distant one.

The literature gives several ways in which the advantage of this fact can be taken.


Figure 3.5. Probability density function of block offset (from [Woo95]).

Restricted Search Area

In fact, the probability that a distant domain will be judged as a matching one is so small that the search can be restricted to spatially close domain blocks only. The remaining part of the search algorithm remains unaffected [Jac93].

Spiral Search

In this approach the search order is modified – the codebook blocks that are more likely to provide a good match for the currently processed range block are tested first. Therefore, the search is performed along a spiral-shaped path, which begins at the codebook block directly above the range block and gradually recedes from the range. The search area can be restricted here by defining the maximal number of blocks that shall be tested for each range block – the length of the path [Bar94b].

Figure 3.6. Spiral search (from [Bar94b])


It can be noticed that the density of the domain blocks tested during the spiral search is higher at the beginning of the path (close to the range block).

Mask

Another way to obtain a small domain pool is to put a mask on the image and center it at the currently processed range block. The mask indicates the locations of domain blocks that should be included in the domain pool. These locations are denser near the center of the mask, and their density decreases with the distance from the center.

Solutions Without Domain Search

There are several ways to eliminate the time-consuming domain search. The first of them pairs a range and a domain when the position of the domain block fulfills some conditions. For example, P. Wakefield [Wak97] proposes to pair domains with ranges in such a manner that the range block lies within the domain block and the dominant edge is in the same relative position in both blocks. Other solutions force the matching domain to be in a fixed relative position to the range [Mon92] or restrict the domain pool to a very small set of domains neighboring the range [Mon93a].

Because this class of fractal methods eliminates one of the most time-consuming phases, the encoding is very significantly accelerated. At the same time, the search-free methods give the best rate-distortion results [Woo95]. However, in medical imaging the information carried by the image is much more important than the achieved compression ratio, and the search-free methods lose details through imprecise matching of domains and ranges. But without any doubt, it can be said that local codebooks outperform global ones, which has been proved in [H¨...]: the PSNR was .3 dB lower for the search with a mask than for a full search, while the domain pool contained only 12% of the domains from the global pool.

3.3. Classes of Transformations

As already said, the transformations determined during encoding have to be affine and contractive. However, this restriction is very weak, and further limitations have to be introduced in order to provide full automation of the encoding process. Thus, the search for the transformations that will constitute the fractal code is performed only within a limited class of affine transformations. The choice of this class influences the effectiveness and fidelity of the algorithm and the convergence properties of the fractal operator. The importance of selecting the right class of transformations therefore cannot be overrated, since it is crucial for both processes of compression – encoding and decoding.

A transformation can usually be decomposed into three separate transformations that are carried out one after another. Therefore, a single elemental block transformation $\tau_i$ (from the domain block $D_i$ to the range block $R_i$) is a composition of three transformations:
$$\tau_i = \tau_i^I \circ \tau_e^S \circ \tau^C$$

After the transformation $\tau_i$ is applied to the domain block $D_i$, the resulting pixels may be copied into the range block $R_i$. Thus, the transformation $\tau_i$ is the key part of the affine transformation $w_i$, which maps the domain block $D_i$ onto the range block $R_i$.


In order to transform a domain block into an appropriate range block, the domain block is first spatially contracted (transformation $\tau^C$); the product of this phase is a codebook block. The order of pixels within the codebook block is then deterministically changed by $\tau_e^S$, i.e. the block undergoes one of the symmetry operations, such as rotation or reflection. The symmetry operation used is taken from a fixed pool; $e$ denotes the index of the used operation. The last component transformation, $\tau_i^I$, is an intensity transform, which adjusts the brightness of the codebook block.

The contraction transformation is usually the same for all domains. However, the symmetry operation is not known before the search for matching pairs of domains and ranges. The same holds for the intensity transformation – when a domain and a range are compared (during the search), this transform is defined in such a way that the error between them is minimized.

All domains $D_k$ ($0 \le k < D$, where $D$ is the length of the domain pool) are transformed by $\tau^C$, which gives the codebook of blocks $C_k$. The codebook can be expanded thanks to the symmetry operations – every block of the codebook is transformed by all symmetry operations and the products of these operations are included in the codebook. Theoretically, this step should allow better matching between the codebook block and the range block (during the search for a domain block fitting a range block). One has to keep in mind that the codebook is virtual, i.e. the codebook blocks are not stored in four copies that differ from each other only by the rotation angle – there is a single copy of a codebook block that is rotated during the search.

Then the real search is performed. For a range block $R_i$, every codebook block $C_k$ is checked – the coefficients of the intensity transformation (those that minimize the distance between the codebook block and the range block) are calculated, i.e. the transformation $\tau_{ik}$ is determined. From all of the $\tau_{ik}$ (and, at the same time, from all of the codebook blocks) the one is picked that gives the minimal error between the range block and the product of the transformation – the chosen transformation becomes $\tau_i$.

The description of the contraction transformation $\tau^C$ can be built into the program – the transformation is the same for all domains/ranges and the same for the encoder and the decoder. But information about $\tau_e^S$ and $\tau_i^I$ has to be attached to the fractal code. In particular, the symmetry operation must be indicated and the coefficients of the intensity transformation must be stored for every range block.

3.3.1. Spatial Contraction

The spatial contraction of domains is not necessary for the process of fractal compression. The transformation must be contractive, but the metrics used to assure the contraction are usually not influenced by the spatial dimension [Dav96, Fis92c]. A sufficient constraint is that a domain block and a range block paired into one transformation cannot be equal. However, spatial contraction is commonly used in almost all fractal compression methods. It was introduced by Jacquin [Jac90b] and carries over directly from the first fractal compression algorithm, where the spatial size of the square domain blocks was twice as large as the size of the range blocks.

Using the same contraction ratio as Jacquin also became a custom – the spatial contraction usually reduces the dimensions of a domain block by two. However, it is possible to adjust this number in order to achieve a desired behavior of the encoder or decoder. A contraction ratio higher than 2:1 decreases the number of iterations needed to reconstruct the image from the fractal code (fractal operator) [Bea90]. It is possible to adjust the contractivity in such a way that the decoding is done in a single iteration [Fis95a]. A contraction ratio smaller than 2:1 entails higher error propagation during decoding, but it also has positive effects – it allows better approximations of range blocks with codebook blocks [Bar94b].

In the original work of Jacquin [Jac90b], the domain block was contracted by averaging four neighboring pixel values into one. According to this, when the width and the height of the codebook block are equal to $h$ and the contraction is made by a factor of 2, the value of a pixel of a codebook block $C_i$ can be calculated from the following formula:
$$C_i(m, n) = \frac{D_i(2m, 2n) + D_i(2m+1, 2n) + D_i(2m, 2n+1) + D_i(2m+1, 2n+1)}{4}$$
for all $m, n \in \{0, \ldots, h-1\}$. This formula can be easily generalized to any size of the codebook block.
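In code, the 2:1 contraction by averaging is a few lines of array slicing; the decimation alternative discussed just below is included for comparison. Function names are illustrative.

```python
import numpy as np

def contract_by_averaging(domain):
    """2:1 spatial contraction: every output pixel is the average of the
    corresponding 2x2 group of input pixels (Jacquin's scheme)."""
    d = domain.astype(float)
    return 0.25 * (d[0::2, 0::2] + d[1::2, 0::2] + d[0::2, 1::2] + d[1::2, 1::2])

def contract_by_decimation(domain):
    """Alternative: keep every second pixel and drop the rest
    (slightly faster, but less accurate)."""
    return domain[0::2, 0::2]
```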

The contraction by neighboring pixel averaging is still very popular, but other contraction schemes have also been proposed that allowed obtaining better coding results [Bar94b]. Instead of averaging neighboring pixels, the excess pixels can simply be removed. This solution slightly speeds up encoding but has a negative influence on the accuracy [Fis92c, Fis95a].

3.3.2. Symmetry Operations

The symmetry operations, also called isometries, operate on the pixels of a block without changing their values. They change the positions of pixels within the block in a deterministic way. For a square block, there are eight canonical isometries [Jac90b]:

1. identity
2. orthogonal reflection about the mid-vertical axis of the block
3. orthogonal reflection about the mid-horizontal axis of the block
4. orthogonal reflection about the first diagonal ($m = n$) of the block
5. orthogonal reflection about the second diagonal of the block
6. rotation around the center of the block, through +90°
7. rotation around the center of the block, through +180°
8. rotation around the center of the block, through −90°
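For a square block stored as a NumPy array, the eight isometries above can be sketched with flips and rotations; the rotation direction convention below is an assumption of the example, not of the thesis.

    import numpy as np

    def isometries(block):
        # the eight canonical isometries of a square block, in the order listed above
        b = np.asarray(block)
        return [
            b,                 # 1. identity
            np.fliplr(b),      # 2. reflection about the mid-vertical axis
            np.flipud(b),      # 3. reflection about the mid-horizontal axis
            b.T,               # 4. reflection about the first diagonal
            np.rot90(b, 2).T,  # 5. reflection about the second diagonal
            np.rot90(b, -1),   # 6. rotation through +90 degrees (assumed clockwise)
            np.rot90(b, 2),    # 7. rotation through +180 degrees
            np.rot90(b, 1),    # 8. rotation through -90 degrees
        ]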

The isometries significantly enlarge the size of the domain pool, so they should result in better fidelity of the reconstructed image. According to a number of researchers, all isometries are used with roughly the same frequency during encoding [Fri94, Mon94b]. This suggests that they are useful and fulfill their purpose. At the same time, other authors argue that the isometries are dispensable and have no positive effect [Jac93, Lu97, Mon94b]. Probably different design choices, not directly related to the isometries, are the main cause of this contradiction [Woh99].

However, an overwhelming agreement can be observed in the literature that the use of the isometries results in a weaker rate-distortion relation [Mon94b, Kao91, Sau96a, Woo94]. Besides, other affine transformations can be used in place of the isometries [Lu97].


3.3.3. Block Intensity

The last component transformation also operates on pixel values, but it changes the luminance of the pixels instead of their positions. Once again, the most basic intensity transformation was introduced already by Jacquin. It is linear and operates on one codebook block (after application of the symmetry operations) and one block of unit components:

$$C_i' = s_i C_i + o_i \mathbf{1}$$

The $s_i$ and $o_i$ denote the scaling and offset, respectively. These coefficients are calculated by the encoder when the best approximation $R \approx s_k C_k + o_k \mathbf{1}$ is found ($0 \leq k < |C|$, where $|C|$ is the length of the codebook).

Although the linear intensity transformation can still be found in many present-day fractal compression methods, other transformations can also be found in the literature. According to their authors, these new approaches improve the fidelity of the compression by enabling a better approximation of a range block by a codebook block.

Orthogonalization

Øien [Øie94b] modified the intensity transform by introducing an orthogonal projection prior to scaling. From the codebook block, the $dc$ component is subtracted. The $dc$ denotes the mean pixel value of the codebook block:

$$dc = \frac{C_i(1) + \cdots + C_i(|C_i|)}{|C_i|}$$

where $|C_i|$ is the number of pixels in a codebook block.

The intensity transform in this case can be described by the following formula:

$$C_i' = s_i \left( C_i - \frac{\langle C_i, \mathbf{1} \rangle}{\|\mathbf{1}\|^2}\, \mathbf{1} \right) + o_i \mathbf{1}$$

The $\langle C_i, \mathbf{1} \rangle$ is the inner product of the codebook block and the block of fixed intensity, and $\|\cdot\|$ is the norm derived from an appropriate product space, i.e. $L^2$ here. This transformation yields a block that is orthogonal to the block of unit coefficients – $\mathbf{1}$ – and gives several advantages. First of all, the $s_i$ and $o_i$ coefficients are decorrelated. When a special choice of domain pool is made (each domain is a union of range blocks in quadtree partitioning, the contraction based on pixel averaging), the decoding is accelerated – the convergence of the decoder is guaranteed in a finite number of iterations. The number of iterations is independent of the $s_i$ and $o_i$ coefficients; only the sizes of the domains and ranges influence it. [Øie93]

Multiple Fixed Blocks

The topic of multiple fixed blocks was raised by Øien, Lepsøy and Ramstad [Øie91] and continued by Monro [Mon93c, Mon93b] and many other researchers. The main idea is based on replacing the single fixed block $\mathbf{1}$ with multiple fixed blocks $V_h$:

$$C_i' = s_i C_i + \sum_h o_{ih} V_h$$


Multiple Codebook Blocks

Another approach uses several codebook blocks that are independently scaled:

$$C_i' = \sum_h s_{ih} C_{ih} + o_i \mathbf{1}$$

It is also possible to merge the multiple fixed blocks approach with the multiple codebook blocks approach. In this case, domains that do not have to be spatially contracted can also be used. [GA94b, GA94a, GA96, GA93, Vin95]

The linear combination of multiple domain blocks and multiple fixed blocks was used in [GA96] and resulted in a very good rate-distortion relation – at the bitrate 0.43, the peak signal-to-noise ratio reached 34.5 dB.

Polynomials

Another attempt at the intensity transformation [Mon93a, Mon94b, Mon94a] resigns from the linear character and uses higher-order polynomials. When the transformation is a second-order polynomial, an additional component is added to Jacquin's basic transformation – the codebook block in quadratic form. A single coefficient (pixel) $c$ of $C_i$ is then mapped to the corresponding coefficient $c'$ of $C_i'$ as

$$c' = s_{i2} c^2 + s_{i1} c + o_i$$

Third-order polynomials require extending the transformation with one more component – the codebook block in cubic form:

$$c' = s_{i3} c^3 + s_{i2} c^2 + s_{i1} c + o_i$$

When the basic linear transformation is used, a single pixel $z$ of the codebook block undergoes the following intensity transformation:

$$\tau_i^I(z) = s_i z + o_i$$

The application of the polynomials modifies the shape of the fractal operator (compare with section 2.2.1) [Kwi01]. Here the operator takes the following form:

$$w_i \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a_i & b_i & 0 \\ c_i & d_i & 0 \\ 0 & 0 & \tau_i^I \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} e_i \\ f_i \\ 0 \end{pmatrix}$$

where the entry $\tau_i^I$ denotes applying the intensity transformation to $z$.

The intensity transformation $\tau_i^I$ looks as follows:

order 2 polynomials: $\tau_i^I(z) = s_{i2} z^2 + s_{i1} z + o_i$

order 3 polynomials: $\tau_i^I(z) = s_{i3} z^3 + s_{i2} z^2 + s_{i1} z + o_i$

Of course, higher-order polynomials can also be applied, but this results in a worse compression ratio because more parameters have to be encoded. However, the higher the order of the polynomials used, the better fidelity can be achieved [Lin97]. The use of second-order polynomials turns out to be the best when it comes to the rate-distortion relation [Woo95].


3.4. Quantization

Quantization occurs in encoding as well as in decoding. During the encoding, the scaling and offset coefficients have to be quantized. The domain positions, the description of the used symmetry operations and any partition description relevant to the adaptivity of the segmentation are represented by discrete values from the beginning.

3.4.1. Quantization During Encoding

Most often, a uniform quantization is used. Nevertheless, the distribution of the scaling and of the offset coefficients in general has a strongly non-uniform character. The application of a uniform quantization method therefore entails inefficiency, and entropy compression of the quantized coefficients can be very useful for eliminating it.

Figure 3.7. Distributions of the scaling and offset coefficients for the second-order polynomial intensity transformation: (a) $s_{i2}$, (b) $s_{i1}$, (c) $o_i$ (from [Zha98]).

The coefficients are stored on various numbers of bits in the solutions of different researchers. The bit allocation for the scaling coefficient takes values from 2 ([Jac93]) to 5 ([Øie94a]) bits, and for the offset coefficient from 6 ([Jac93]) to 8 ([Øie94a]) bits. The combination of 5-bit quantization of the scaling coefficient $s_i$ and 7-bit quantization of the offset coefficient $o_i$ was found to be optimal [Fis95b].
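A sketch of such a uniform quantizer is shown below; the coefficient ranges used in the example ($s$ in $[-1, 1]$, $o$ in $[0, 255]$) are assumptions made for illustration, not values taken from the cited works.

    import numpy as np

    def quantize_uniform(value, lo, hi, bits):
        # map value in [lo, hi] to an index on `bits` bits and back to its level
        levels = 2 ** bits
        step = (hi - lo) / (levels - 1)
        index = int(round((float(np.clip(value, lo, hi)) - lo) / step))
        return index, lo + index * step

    # 5 bits for the scaling and 7 bits for the offset coefficient
    s_index, s_hat = quantize_uniform(0.62, lo=-1.0, hi=1.0, bits=5)
    o_index, o_hat = quantize_uniform(87.3, lo=0.0, hi=255.0, bits=7)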

Besides uniform quantizers, also logarithmic and pdf-optimized quantizers were investigated by researchers. The logarithmic quantization did not turn out to be better in the context of fractal compression [Fis95a]. The pdf-optimized quantization shrank the bit allocation for the parameters of a single domain block to 5 – 6 bits with small costs in fidelity [Øie94a].

The quantization of the coefficients can be performed directly before adding them to the fractal code. However, many algorithms, especially those that pay special attention to fidelity, quantize the coefficients before computing the error between a range block and a transformed domain block. This solution slows down the encoder because not only the final coefficients but all of the scaling and offset coefficients (calculated for every domain from the pool during the search) are quantized. But the quantization operation may influence the error value between a range and a domain (after contraction and isometries). Thus, not necessarily the same domain will be indicated as the closest to a given range when the blocks are described by real coefficients as after applying quantization. Quantization before the error computation also ensures that both the encoder and the decoder use the same coefficient values.

The scaling and offset coefficients in transformations without orthogonalization are correlated, so the two coefficients together can be vector quantized [Bar94b, Bar93, Har97].

3.4.2. Quantization During Decoding

Quantization also occurs during decoding. Each iteration of the algorithm produces an image that is an approximation of the fixed point of the IFS. In the original approach, the images created in successive iterations were stored as raster images, i.e. the pixel values were quantized. However, the brightness of the fixed point's pixels takes real, not discrete, values, and the error caused by quantization in this solution propagates to the results of the following iterations. This may cause difficulties with reaching the correct brightness values of some pixels.

This problem can be minimized by introducing matrices of real numbers to represent the images created in successive iterations. This solution is called Accurate Decoding with Single-time Quantization and guarantees that the quantization will be performed only once – when the matrix from the last iteration is converted to a raster image. [Kwi01]

3.5. Decoding Approaches

The fractal code contains the quantized coefficients of the fractal operator. The decompression is actually the process of computing the fixed point described by this operator. The fractal operator is independent of the size of the original image, so the decoding may produce a reconstructed image of any size – the image may be zoomed in or zoomed out compared to the original one.

The basic decoding algorithm is based on PIFS and was already explained in section 2.2.1. One of the advantages of fractal compression is fast decoding – usually it takes fewer than 10 iterations. However, some alternative approaches have been introduced that improve the speed or accuracy of the process.


3.5.1. Pixel Chaining

The method can be utilized only when the intensity transform is based on subsampling. In such a situation, each pixel of a range block is associated with one reference pixel in the domain block – the range and the domain are paired by a transformation. The reference pixel lies not only in the area of the domain block but also in the area of some other range block. Thus, another reference pixel is associated with it. In this way, a chain of associated pixels is created.

The pixel chain can be used in two manners. The first is to track the path of influence in order to find a pixel with the wanted value. The second executes a part of the chain long enough to achieve an acceptable pixel value. [Fis95a, Lu97]

3.5.2. Successive Correction Decoding

The basic decoding algorithm uses for each iteration a temporary image in which the changes are made by the transformations. This means that the image that provides the virtual codebook in the current iteration remains unchanged by the transformations, and the range blocks are situated on the temporary image.

The successive correction method is inspired by the Gauss-Seidel correction scheme.

The basis of the successive correction algorithm is abandoning the temporary images – the transformations operate on the same image. The domain blocks covering the currently decoded range blocks are immediately updated, i.e. the change made by one transformation is visible for transformations executed after that one, even within the same iteration.

The main advantage of this technique is increased decoding speed. A further speed improvement can be made by ordering the transformations. Domains are laid out in the image with varying density. Transformations whose domains lie in areas of the highest domain concentration are executed first in each iteration. [Kan96, Ham97b]

3.5.3. Hierarchical Decoding

The first stage of hierarchical decoding is actually nothing else than the baseline decoding algorithm. The only difference is that the image is reconstructed at a very low resolution – the size of the range blocks is reduced to only one pixel. This low-resolution image is treated as a basis for finding the fixed point at any other resolution with a deterministic algorithm (similarly to wavelet coding – the transformations from domains to ranges are treated as consecutive resolution approximations in the Haar wavelet basis).

Because vectors of lower dimensions are processed during the IFS reconstruction, there are considerable computational savings compared to the standard decoding scheme. [Bah93, Bah95, Mon95]

3.5.4. Decoding with orthogonalization

This approach was already mentioned in section 3.3.3. It requires some changes to the encoding process, i.e. all domain blocks from the pool have to consist of a union of range blocks and the intensity transform has to utilize orthogonalization. These restrictions result in meaningful benefits: a computationally simple decoding algorithm based on a pyramid structure, a decoding length independent of the transformation coefficients (it depends only on the domain and range sizes), and decoding at least as fast as in the basic scheme. [Øie93]

3.6. Post-processing

Any fractal compression method is based on blocks and, because of this, block artifacts are always present in the reconstructed image. In order to reduce the undesired artifacts, the reconstructed image can be post-processed: the block boundaries are subjected to smoothing. [Fis95a, Lu97]

There are at least several ways to reduce the block artifacts during post-processing. The first one is simply the right choice of partitioning method – overlapped ranges give very good results, and the blocks are also less noticeable when a highly adaptive partitioning method is used.

A simple method that uses a lowpass filter can be employed. However, the results are not satisfying [Fis95c]. Other, more complex estimation-based methods give better performance [Zak92, Ste93].

There are also post-processing methods that depend on the partition scheme used in the compression and aim for the best overall performance, taking into consideration the human visual system. The Laplacian pyramidal filtering presented in [Lin97] is an example of such a method.

3.7. Discussion

The chapter presents the diversity of issues connected with building a fractal compression method and, at the same time, the large diversity of the methods. Although the basis of fractal compression remains the same in all implementations, there is still notable latitude in constructing a fractal compression method because there are no standards for it – only a general idea of how to utilize the fractal theory for image compression. This freedom can be problematic because there is not always agreement on which solutions for particular elements of the fractal compression method yield the best effects. This confusion is amplified by the fact that each design decision influences the performance of other design elements.

As an example, the choice of partitioning scheme can be given – there is disagreement in the literature about which one is the best. Some researchers regard the simple quadtree scheme as superior to polygonal and HV partitions [Reu94]. Others, at the same time, show that the HV partitioning gives better results than the quadtree [Fis94, Har00, Ruh97]. However, most researchers agree that irregular regions give better results in the rate-distortion sense than the quadtree scheme [Och04, Sau96d, Cha97, Ruh97, Har00, Bre98]. The comparison of HV with irregular schemes does not show as large a superiority of the methods based on irregular regions [Har00], especially for the methods utilizing quadtree partitioning in the split phase [Cha97, Cha00, Och04]; in some studies these two approaches yield very similar rate-distortion performance [Ruh97]. One can notice that for small compression ratios, for which the best fidelity can be obtained, the HV partitioning results in a slightly better peak signal-to-noise ratio. However, the irregular-shaped approaches allow encoding an image with the same PSNR faster. A remarkable observation is that none of the partitioning schemes that are not based on right-angled regions matches the performance of the above-mentioned methods [Woh99].

The effectiveness of fractal compression can be improved by merging it with transform coding or wavelets. Nevertheless, such hybrid methods are not discussed in this document.

Chapter 4

Accelerating Encoding

The main drawback of fractal image compression is its computational complexity and the resulting long encoding time. The most time-consuming part of the encoding scheme is the search through the domain block pool in order to find the best possible match for a given range block. The time complexity of the encoding is $O(n)$, i.e. the time spent on each search is linear in the number of domains ($n$) in the pool.

Researchers have undertaken many attempts to accelerate the encoding process. The solutions they propose can be divided into two groups: complexity reduction techniques and parallelization of the implementation. This chapter presents short descriptions and explanations of the most successful acceleration techniques.

4.1. Codebook Reduction

The reduction of the codebook/domain pool size is the simplest and most obvious speed-up technique. The first way to achieve it is to utilize the local codebook instead of the global codebook or the full search (compare with section 3.2). There are also other techniques that decrease the size of the domain pool independently of its type.

The domain pool can contain domains that are very close to each other (in the error measure sense). Eliminating such domains allows significantly reducing the size of the domain pool without loss of fidelity – when the distance between the domains is below a certain level, then after contraction they will become the same (or almost the same) codebook blocks. The method utilizes the invariant representation, which will be explained below. [Sig97]

A block with low variance cannot be changed into a block with higher variance by any transformation that is considered in fractal coding. However, uniform or low-variance blocks can be generated with a fixed block or an absorbent transformation. [Jac90a] Thus, a range block has to be paired with a domain block with higher variance to create a contractive affine transformation, and there is no need for keeping the low-variance domains in the pool. The awareness of this fact allowed Saupe to create a domain pool reduction technique that excludes from the search a fraction of the domain pool equal to $1 - \alpha$. The size of the fraction was adjusted with the parameter $\alpha \in (0, 1]$ in order to investigate the impact of the pool reduction on the image fidelity, computation time and compression ratio. The results for encoding with Fisher's quadtree partitioning scheme are as follows. The computation time is directly proportional to the parameter $\alpha$, and the reduction of the domain pool with this method does not negatively influence the fidelity (even for low values of $\alpha$, e.g. $\alpha = 0.15$); it can even slightly improve the fidelity. [Sau96b]
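A minimal sketch of this kind of reduction – keeping only the fraction α of domain blocks with the highest variance – is given below; it follows the idea rather than any particular published implementation.

    import numpy as np

    def reduce_domain_pool(domain_blocks, alpha=0.15):
        # keep the fraction alpha of domains with the highest intensity variance
        variances = np.array([np.var(d) for d in domain_blocks])
        keep = max(1, int(round(alpha * len(domain_blocks))))
        order = np.argsort(variances)[::-1]      # highest variance first
        return [domain_blocks[i] for i in order[:keep]]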

4.2. Invariant Representation and Invariant Features

During the search for matching domains and ranges, the domain blocks and range blocks cannot be compared directly. The distance is measured between the range block and the transformed domain block – after contraction, isometries (not in all algorithms) and the intensity transform. Thus the problem is not only in finding the correct domain block for the range block but also in finding the transformation parameters that minimize the distance between the blocks.

The invariant representations/features of blocks were introduced to fractal compression in order to ease the distance measure by enabling direct comparison of the domain block and the range block.

The original version of invariant features, proposed by Novak, assigns a 4-dimensional feature vector to each block [Nov93]. The components of the vector are invariant moments, which are defined from the gray level distribution of the block. One vector fully suffices for one domain because it is insensitive to any geometric transformation (isometry) of the domain block. The shape of the feature vector depends on the luminance of the block. To solve this problem, a normalization procedure (with respect to mean and variance) was introduced.

There are several drawbacks of this method. There is no (and there cannot be) a theory that would support the argument that closeness of the feature vectors ensures closeness of the range and domain blocks in the error measure sense. Another problem is that blocks with negative intensity are not considered at all; this can have a negative effect on the fidelity. The nearest neighbor search is impossible in this approach without logarithmic rescaling because the components of the vectors take values from various orders of magnitude. [Sau96c]

The method elaborated by Novak was originally designed for triangular partitioning. Frigaard adapted his work to quadtree partitioning, but he removed the normalization from the method because, according to him, it can decrease the quality of the encoding [Fri95]. Novak's approach was followed not only by Frigaard but also by Popescu and Yan [Pop93].

Generally, the invariant representation techniques differ from the invariant features in that they cannot be invariant to the block isometries. However, they are invariant to the block intensity. The basic approach is based on an orthogonal projection of the block onto the orthogonal complement of the space spanned by the fixed block coefficients, which is followed by a normalization. Another approach utilizes the DCT (after applying the transform, the $dc$ coefficient is zeroed), followed by normalization. [Bea90, Sau95b, Woh95]


A great advantage of the invariant representation, besides the search time reduction, is the possibility of adapting the distance measure to the properties of the human visual system. [Bea90, Bar94a]

4.3. Nearest Neighbor Search

This method of accelerating the search boils down the range-domain block matching problem to a nearest neighbor search problem. The time complexity is reduced from $O(n)$ to $O(\log n)$.

Prior to the proper search, a preprocessing stage is performed, in which the set of codebook blocks to be searched is arranged in an appropriate data structure – tree structures are usually used. The nearest neighbor search utilizes the invariant representations of the blocks, for which a function is provided that gives the Euclidean distance between the projections of a domain block and a range block. The minimization of this distance shall be equivalent to the minimization of the error between the domain and the range. Thus, the set contains the codebook blocks that are the closest to the currently considered range block in terms of the Euclidean distance measure.

Many algorithms were developed to determine the neighborhood of the range block

– existing techniques [Ary93, Sam90] have been applied to fractal compression [Kom95,

Sau95b, Sau95a] as well as new ones have been especially designed [BE95, Cas95].
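The sketch below illustrates the idea with SciPy's kd-tree; the "invariant representation" used here is only a simple zero-mean, unit-norm projection of equally sized blocks, an assumption of the example rather than the feature vectors of the cited works.

    import numpy as np
    from scipy.spatial import cKDTree

    def project(block):
        # simple invariant representation: remove the mean and scale to unit norm
        v = np.asarray(block, dtype=float).ravel()
        v = v - v.mean()
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v

    def build_search_tree(codebook_blocks):
        # preprocessing stage: arrange the projected codebook blocks in a kd-tree
        return cKDTree(np.array([project(c) for c in codebook_blocks]))

    def nearest_codebook_blocks(tree, range_block, k=5):
        # indices of the k codebook blocks closest to the projected range block
        _, indices = tree.query(project(range_block), k=k)
        return np.atleast_1d(indices)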

4.4. Block Classification

Both types of blocks, domain and range, have features that can be used to classify them. Each block belongs to exactly one class. This allows restricting the search process to only a part of the domain pool – the domain has to be in the same class as the range. Although the time complexity is still linear, the classification of blocks reduces the factor of proportionality in the $O(n)$ complexity. The literature provides many different classification schemes, which can be divided into three groups discussed in the following subsections.

4.4.1. Classification by Geometric Features

Block classification occurred already in Jacquin's work [Jac89, Jac90a, Jac92], where the classification designed for vector quantization by Ramamurthi and Gersho [Ram86] was adapted for the purpose of fractal compression. In this scheme, the domains are divided according to their block geometry into four classes: shade blocks, simple and mixed edge blocks, and midrange blocks. The shade blocks class contains only blocks with a very low variance of the intensity. To the classes of edge blocks belong blocks where strong changes of the intensity are observed. A block is "midrange" when it has considerable variance but no pronounced edge.

Since the shade blocks can be replaced with a fixed block or an absorbent transformation imposed on a block with higher variance, the whole class of shade blocks does not have to be searched while matching a domain with a range (compare with section 4.1). Thus, the block classification method can be combined with domain pool reduction. [Jac90a, Jac90b]


The main drawback of this classification scheme is its weak performance for large blocks or blocks with weak edges or strongly contrasted textures. Such blocks are often incorrectly classified due to inaccurate block analysis, which results in artifacts in the reconstructed images [Jac90a]. The speed-up is also not very impressive since there are only four classes.

4.4.2. Classification by intensity and variance

The classification technique elaborated by Jacobs, Fisher and Boss works on square blocks subdivided into four quadrants. The upper left quadrant is marked with $B_1$, the upper right with $B_2$, the lower left with $B_3$, and the lower right with $B_4$. For each quadrant the average pixel intensity $B_j$ and the variance $V_j$ are computed ($j = 1, \ldots, 4$). With symmetry operations, the block can be transformed in such a way that the quadrants will be ordered in one of three ways:

$$B_1 \geqslant B_2 \geqslant B_3 \geqslant B_4 \qquad B_1 \geqslant B_2 \geqslant B_4 \geqslant B_3 \qquad B_1 \geqslant B_4 \geqslant B_2 \geqslant B_3$$

No other order has to be considered because one (and only one) of the above orders can always be attributed to any block. Thus, based on the three orderings of the quadrants according to the average intensity, the three major classes are defined.

When a block is classified into one of the major classes, the subclass is determined. The subclasses are defined by the ordering of the quadrant variances $V_j$, with the constraint that symmetry operations are not allowed. Since there are four quadrants, for each major class there are 24 subclasses.

The range and the domain blocks that are bound by a transformation $\tau_i$ must have the same ordering of the quadrants (when the scaling coefficient $s_i > 0$) or the opposite ordering (when $s_i < 0$). Thus, during the search, two subclasses of domain blocks have to be searched in order to find the best match for a range block.

This method has one weakness – the impossibility to extend the search to neighboring classes. This would be very helpful when a domain that yields an error smaller that the admissible distance between a range and a domain cannot be found in either of the two subclasses.

The second classification scheme based on the intensity and variance, proposed by

Caso, Obrador and Kuo [Cas95], is devoid of this drawback. In this scheme, the major

classes remain as they were in the scheme of Boss, Fisher and Jacobs. However, the subclasses are based on strongly quantized vectors of variances. Each vector produces a class of domains and it is possible to point out neighboring classes.
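The determination of the major class can be sketched as follows, assuming the three canonical orderings reconstructed above and square blocks with even sides; the subclass computation from the variance ordering is omitted for brevity.

    import numpy as np

    def quadrant_means(block):
        # 2x2 array of the average intensities of the four quadrants
        h, w = block.shape
        return np.array([[block[:h // 2, :w // 2].mean(), block[:h // 2, w // 2:].mean()],
                         [block[h // 2:, :w // 2].mean(), block[h // 2:, w // 2:].mean()]])

    def major_class(block):
        # orient the block with one of the 8 isometries so that the quadrant
        # averages satisfy one of the three canonical orderings
        q0 = quadrant_means(np.asarray(block, dtype=float))
        layouts = [q0, np.fliplr(q0), np.flipud(q0), q0.T, np.rot90(q0, 2).T,
                   np.rot90(q0, -1), np.rot90(q0, 2), np.rot90(q0, 1)]
        for iso, q in enumerate(layouts):
            b1, b2, b3, b4 = q[0, 0], q[0, 1], q[1, 0], q[1, 1]
            if b1 >= b2 >= b3 >= b4:
                return 1, iso
            if b1 >= b2 >= b4 >= b3:
                return 2, iso
            if b1 >= b4 >= b2 >= b3:
                return 3, iso
        raise AssertionError("one of the canonical orderings always exists")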

4.4.3. Archetype classification

This classification scheme was elaborated by Boss and Jacobs [Bos95]. The method defines the classes during a preliminary learning phase, in which several training images are examined. From a set of codebook blocks, one block is singled out for the role of the archetype. This privileged block ($D_a$) is the one that best approximates the other blocks in the set:

$$D_a = \arg\min_{D_p} \sum_{j \neq p} \min_{s_i, o_i} \left\| D_j - (s_i D_p + o_i \mathbf{1}) \right\|$$


An iteration of the learning process starts from an arbitrary classification of the blocks belonging to the training images. The authors of the scheme used here the Boss, Fisher and Jacobs method based on intensity and variance. For each of these preliminary classes an archetype is computed. When all archetypes are known, a new set of classes is defined – blocks are moved from their prior classes to the classes whose archetypes cover them best. The archetype computation and reclassification is repeated in a loop as long as any change in the classification occurs in an iteration.

The final set of archetypes can be used during the encoding of any number of images. When an image is being compressed, all domain blocks are classified using the set of archetypes – for each block, the archetype that covers it best is found (just after the image segmentation and domain pool definition).

Experiments carried out by Boss and Jacobs [Bos95] show that a given range block can be covered very well by a domain block that is classified into the same class. This solution guarantees both acceleration of the encoding process and preservation of fidelity. The archetype classification is much slower than a conventional classification scheme for low quality image encoding since it is much more complex. However, when high fidelity has to be assured, the archetype classification turns out to be faster.

4.5. Block Clustering

The clustering method is similar to the classification. Blocks are clustered around centers, which can be computed adaptively from the image to be encoded or from a set of training images. The use of a set of training images reduces the time costs because the clustering does not have to be computed during encoding. During the search, first the optimum cluster center is located and then the optimum domain within this cluster. The classes, i.e. the sets of blocks grouped around the cluster centers, are defined by the clustering algorithm.

The clusters define disjoint subsets of domains. The subsets are defined by cluster centers, which at the same time are their representatives. The distances between blocks within one cluster should be smaller than distances to blocks from other clusters. A criterion function is used to measure the quality of the clustering.

A range block is encoded in two phases. First, the closest cluster center is found and, after that, the range block is compared to the domain blocks within the cluster of the found center. This scheme is very similar to the search within one class. Instead of performing the search only through one cluster, the clusters of the centers neighboring the closest center can also be searched. This would increase the fidelity but at the cost of time.

Several clustering methods have been introduced. They can be classified into three groups: based on the generalized Lloyd algorithm, the pairwise nearest neighbor algorithm and self-organizing maps. Øien, Lepsøy and Ramstad presented an efficient clustering method based on the Lloyd algorithm [Øie92, Øie93, Lep95]. An example of this class of clustering methods is also proposed by Davoine, Antonini, Chassery and Barlaud in [Dav96]. The nearest neighbor approach can be found in the work of Wein and Blake [Wei96] and Saupe [Sau95c]. The self-organizing maps were utilized already in the first approach to adaptive clustering for fractal compression [Bog92], but with unsatisfying results. Hamzaoui combined the self-organizing maps with Fisher's classification scheme (based on intensity and variance) and Saupe's nearest neighbor algorithm [Ham97a].

4.6. Excluding impossible matches

The aim of block classification or clustering is to group the most likely matches for the currently considered range block. This method is based on the opposite idea – the domain blocks that cannot be matched are excluded. The literature gives two ways in which this can be achieved.

The first one utilizes inner products with a fixed set of vectors. The range and the domain are independently compared with a certain unit vector, and the domain block can provide a good approximation of the range block only when the results of these comparisons are similar. Thanks to them, a lower distance bound is provided during the domain-range comparison, which allows eliminating many of the domains from the precise distance measurement. [Bed92]

The second solution is based on the distribution of energy within the image blocks, which is treated as a feature of the blocks. These features are used to detect distance inequalities. [Cas95]

4.7. Tree Structured Search

To facilitate the search, the blocks can be organized in a tree structure. An example of a tree-structured search is the solution proposed by Caso, Obrador and Kuo [Cas95].

The tree is built by a recursive algorithm as follows.

1. Choose the size $s_b$ of the buckets of domain blocks and assign the whole domain pool to the container $P_D$.
2. Choose two random domain blocks – they become the parent blocks.
3. Assign each of the remaining domain blocks to the parent block that is closest to the currently considered domain block (in the error sense). At this point, the initial set $P_D$ is divided into two subsets $P_{D1}$ and $P_{D2}$.
4. If the size of the subset $P_{D1}$ is greater than $s_b$, then perform steps 2 – 4 for this subset ($P_{D1}$ will take the role of $P_D$ in that iteration). Repeat this step for the subset $P_{D2}$.
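A sketch of this recursive construction is shown below; the distance between two (equally sized) domain blocks is taken to be their summed squared difference, which is an assumption of the example.

    import random
    import numpy as np

    def block_distance(a, b):
        return float(np.sum((np.asarray(a, float) - np.asarray(b, float)) ** 2))

    def build_domain_tree(pool, bucket_size):
        # leaf: a bucket with at most bucket_size domain blocks
        if len(pool) <= bucket_size:
            return {"bucket": pool}
        parent1, parent2 = random.sample(pool, 2)      # step 2: two random parents
        subset1, subset2 = [parent1], [parent2]
        for block in pool:                             # step 3: assign to the closer parent
            if block is parent1 or block is parent2:
                continue
            if block_distance(block, parent1) <= block_distance(block, parent2):
                subset1.append(block)
            else:
                subset2.append(block)
        return {"parents": (parent1, parent2),         # step 4: recurse on both subsets
                "children": (build_domain_tree(subset1, bucket_size),
                             build_domain_tree(subset2, bucket_size))}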

During the search, a range block is compared to the domain blocks starting from the root. At each level of the binary tree, the node that is closer to the range block is chosen. The path of the range finishes when a bucket leaf is encountered – the closest domain block is chosen from this bucket or from the path leading to this bucket, and this finishes the search process.

The fidelity of this method can be improved by extending the search to some nearby buckets. The authors provide a numerical test based on an angle criterion to determine the neighboring buckets.

There are plenty of other accelerating techniques that use tree structure, e.g. the majority of nearest neighbor search techniques or multiresolution search.

4.8. Multiresolution Search

The method described in this section has many names: pyramidal search, multigrid method, multiresolution analysis, multilevel approach and hierarchical representation [Duf92, Ros84, Ram76, Tan96]. It was independently adapted to fractal image compression by Lin and Venetsanopoulos [Lin95a, Lin95b], Dekking [Dek95a, Dek95b] and Kawamata, Nagahisa and Higuchi [MK94]. The multiresolution approach is based on a tree structure where each level represents blocks at a different resolution. The root of the tree contains the domains at the coarsest resolution. Successive levels contain copies of the same blocks at a resolution twice as large as the prior level (i.e. the block field is four times larger).

The search begins in the root of the tree; at each level of the tree, only those blocks are considered that are connected with the best match from the previous level, i.e. which are versions of the block found on the prior level but at a finer resolution. At each level of the pyramid, the range block is downsampled to the same resolution as the domain blocks. The computational cost, which is proportional to the product of the number of domains to be searched and the number of pixels in each block, is significantly reduced by this method.

Other methods that are related to the multiresolution approach were proposed by

Caso, Obrador and Kuo [Cas95] and Bani-Eqbal [BE95].

4.9. Reduction of Time Needed for Distance Calculation

An important part of the encoding time is consumed by the calculation of the distance between the range and the codebook block; thus, improvements in this area are desired.

The search time may be decreased by computing the partial distance known from vector quantization. In this method, an invariant representation is constructed based on Hadamard transform coefficients (the Haar transform was used for this purpose in [Cas95]) in zigzag scan order. The transform shifts most of the variance to the initial elements of the vector. [Bea90]

The most time-consuming part of the distance calculation is the computation of the inner product between the codebook and range blocks. The effectiveness of these computations may be significantly improved by calculating the convolution (cross-correlation) of a particular range block with the whole downfiltered image (in the frequency domain) – this yields the inner products between the range and all codebook blocks at once. The method takes advantage of the fact that the domain blocks (and thus also the codebook blocks) overlap in the image and does not bring about any trade-off between fidelity and speedup. Many of the acceleration methods may result in a suboptimal choice of the match between domains and ranges, but here the codebook block with the minimal collage error is matched, i.e. no additional loss of information is caused.

Typically, the computations of the optimal scaling and offset coefficients, $s_i$ and $o_i$ respectively, are carried out from the formulas:

$$s_i^j = \frac{|C_j| \langle C_j, R_i \rangle - \langle C_j, \mathbf{1} \rangle \langle R_i, \mathbf{1} \rangle}{|C_j| \langle C_j, C_j \rangle - \langle C_j, \mathbf{1} \rangle^2}, \qquad o_i^j = \frac{\langle R_i, \mathbf{1} \rangle - s_i^j \langle C_j, \mathbf{1} \rangle}{|C_j|}$$

where $|C_j|$ is the number of pixels in a codebook block. The search is performed in the following order (compare with the algorithm in section 2.2.2):

1. Compute $\langle C_k, C_k \rangle$ and $\langle C_k, \mathbf{1} \rangle$ for all codebook blocks $C_k$.
2. For each range $R_i \in R$:
   2.a. compute $\langle R_i, R_i \rangle$ and $\langle R_i, \mathbf{1} \rangle$;
   2.b. for all codebook blocks $C_j \in C$:
      2.b.i. compute $\langle C_j, R_i \rangle$;
      2.b.ii. compute the coefficients $s_i^j$ and $o_i^j$;
      2.b.iii. compute the distance $d(\tau_i^{I,j}(C_j), R_i)$, where $\tau_i^{I,j}(C_j) = s_i^j C_j + o_i^j \mathbf{1}$ is the intensity transformation.

It can be observed in this algorithm that the only values that have to be computed for one pair of codebook block and range block are $\langle C_j, R_i \rangle$, $\langle C_j, C_j \rangle$, $\langle C_j, \mathbf{1} \rangle$, $\langle R_i, R_i \rangle$ and $\langle R_i, \mathbf{1} \rangle$, because the coefficients $s_i^j$, $o_i^j$ and the distance measure $d$ are functions of these values. Of the five values, the computation of the inner products of the domains and ranges consumes the most time since it is performed in the innermost loop.

These computations are moved to a more outer loop by the application of the method based on fast convolution. The search process is modified in the following manner:

1. Compute $\langle C_k, C_k \rangle$ and $\langle C_k, \mathbf{1} \rangle$ for all codebook blocks $C_k$.
2. For each range $R_i \in R$:
   2.a. compute $\langle R_i, R_i \rangle$ and $\langle R_i, \mathbf{1} \rangle$;
   2.b. compute the convolution of the range $R_i$ with the downfiltered image;
   2.c. for all codebook blocks $C_j \in C$:
      2.c.i. compute the coefficients $s_i^j$ and $o_i^j$;
      2.c.ii. compute the distance $d(\tau_i^{I,j}(C_j), R_i)$, where $\tau_i^{I,j}(C_j) = s_i^j C_j + o_i^j \mathbf{1}$ is the intensity transformation.

The computation of the products $\langle C_k, C_k \rangle$ and $\langle C_k, \mathbf{1} \rangle$ can also be accelerated by the convolution technique.
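A sketch of steps 2.a – 2.c is given below; it assumes that the codebook consists of all blocks of the range's size taken at every position of the spatially contracted ("downfiltered") image, and it uses SciPy's FFT-based convolution with a flipped kernel to obtain the cross-correlation.

    import numpy as np
    from scipy.signal import fftconvolve

    def inner_products_with_all_codebook_blocks(contracted_image, range_block):
        # <C_j, R_i> for every codebook block position, computed at once:
        # correlation = convolution with the kernel flipped in both directions
        r = np.asarray(range_block, dtype=float)
        img = np.asarray(contracted_image, dtype=float)
        return fftconvolve(img, r[::-1, ::-1], mode="valid")

    def scaling_and_offset(cr, cc, c1, r1, n):
        # s and o from the precomputed inner products:
        # cr = <C,R>, cc = <C,C>, c1 = <C,1>, r1 = <R,1>, n = number of pixels
        denom = n * cc - c1 ** 2
        safe = np.where(denom == 0, 1.0, denom)
        s = np.where(denom == 0, 0.0, (n * cr - c1 * r1) / safe)
        o = (r1 - s * c1) / n
        return s, o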

4.10. Parallelization

The speed-up by reduction of the computational complexity gives significant results, but it can be strengthened by parallelizing the coding algorithm. This is especially simple and promising since each range block is encoded independently – the domain searches are also independent. The literature [Lin97, Kwi01] gives several possible parallel implementations:

– The process of choosing the best transformation parameters is done in parallel for parts of the transformation area or for each transformation.
– All range blocks are encoded in parallel – each range block is encoded in a separate thread/process.
– The domain block searches for each range block are implemented in parallel.

These are not exclusive options – for example, they can be combined in a nested parallel pattern: encoding different range blocks and searching the domain blocks for each range block are both parallelized.
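The second option – encoding each range block in a separate process – can be sketched with Python's multiprocessing module; the placeholder search below simply picks the closest equally sized codebook block and stands in for any of the search routines discussed earlier (on platforms that spawn processes, the call has to be guarded by an if __name__ == "__main__" block).

    from multiprocessing import Pool
    import numpy as np

    def encode_one_range(task):
        # placeholder search: return the index of the closest codebook block
        index, range_block, codebook = task
        errors = [float(np.sum((c - range_block) ** 2)) for c in codebook]
        return index, int(np.argmin(errors))

    def encode_ranges_in_parallel(range_blocks, codebook, workers=4):
        # each range block is encoded in a separate process; searches are independent
        tasks = [(i, np.asarray(r, float), [np.asarray(c, float) for c in codebook])
                 for i, r in enumerate(range_blocks)]
        with Pool(processes=workers) as pool:
            return dict(pool.map(encode_one_range, tasks))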

Chapter 5

Proposed Method

The review of fractal compression methods presented in chapter 3 reveals two approaches to partitioning that are regarded by researchers as the best in the rate-distortion sense. The first one is based on hierarchical horizontal-vertical partitioning and the second one utilizes irregular regions.

Figure 5.1. Comparison of fractal compression methods based on irregular-shaped regions and HV partitioning: (a) performance of the compression methods according to the literature; (b) performance of the compression methods at compression ratios acceptable for medical images.

Figure 5.1, which presents the performance of the best fractal compression methods using one of these partitioning schemes, clearly shows that irregular regions are superior to HV partitions at higher compression ratios. Since the image fidelity has to be preserved in medical imaging, the highest compression ratios will never be used, and the HV scheme turns out to be better at low or very low compression ratios.


Although the irregular partitions built from the product of a quadtree-based split phase [Och04] are way ahead of all other methods, there is no proof that they will give equally good results at lower compression ratios. Therefore, since the increase of PSNR with the reduction of the compression ratio is smaller for the irregular regions, it is assumed that the horizontal-vertical approach is superior at low compression ratios.

Thus, the horizontal-vertical partitioning was found to be the best option for medical images. This approach is implemented and tested. Nevertheless, it is also considered whether the best hierarchical partitioning method can be adapted for use as the first stage (splitting) in constructing irregular regions. The utilization of the HV method should give further improvement compared to the irregular regions based on quadtree partitioning.

5.1. The Encoding Algorithm Outline

The design of the elaborated algorithm is typical for hierarchical partitioning methods. The outline of the algorithm is shown in the block diagram in figure 5.2.

The presented algorithm always finds the best matching codebook block for the current range block. However, another approach is also considered, in which the search is interrupted after finding the first codebook block that yields an error fulfilling the tolerance criterion.
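The control flow of the diagram can be summarized by the following rough sketch; the callables passed in stand for the search, splitting and size checks described later in this chapter, so this is an outline of the flow rather than the actual implementation.

    from collections import deque

    def encode_hierarchically(initial_ranges, search_best_match, split_range,
                              can_be_split, error_tolerance):
        # initial_ranges: starting range block(s); the callables stand in for the
        # codebook search, the block splitting and the size-threshold check
        fractal_code = []
        queue = deque(initial_ranges)
        while queue:
            range_block = queue.popleft()
            error, transformation = search_best_match(range_block)
            if error <= error_tolerance or not can_be_split(range_block):
                # store the transformation (even if only the best available one)
                fractal_code.append((range_block, transformation))
            else:
                # split into two new range blocks and encode them instead
                queue.extend(split_range(range_block))
        return fractal_code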

5.2. Splitting the Blocks

The method briefly described in section 3.1.3 is the basis of the block division

algorithm in the proposed compression method. However, some improvements are introduced. In the following description, the symbols $m$ and $n$ denote the index of the column and of the row, respectively. The size of the block to be divided is $width(R_i) \times height(R_i)$.

The original algorithm was able to detect the most significant edge only when a row or column with higher mean intensity came first while traversing the image. Two columns will produce a negative value of $v_m$ when the mean intensity of the pixels in column $m$ is smaller than the mean intensity in column $m + 1$. Because the maximal value of $v_m$ indicates the edge that will be used for the division, the most significant edge may be missed. An analogous situation takes place while determining the maximal $h_n$. The original formulas are:

$$v_m = \frac{\min(m,\; width(R_i) - 1 - m)}{width(R_i)} \left( \sum_{n=0}^{height(R_i)-1} r_{m,n} - \sum_{n=0}^{height(R_i)-1} r_{m+1,n} \right)$$

$$h_n = \frac{\min(n,\; height(R_i) - 1 - n)}{height(R_i)} \left( \sum_{m=0}^{width(R_i)-1} r_{m,n} - \sum_{m=0}^{width(R_i)-1} r_{m,n+1} \right)$$

Figure 5.2. Hierarchical fractal encoding. Block diagram.

To ensure that the actual most significant edge becomes the cutting line, the formulas used to calculate $v_m$ and $h_n$ are slightly modified:

$$v_m = \frac{\min(m,\; width(R_i) - 1 - m)}{width(R_i)} \left| \sum_{n=0}^{height(R_i)-1} r_{m,n} - \sum_{n=0}^{height(R_i)-1} r_{m+1,n} \right|$$

$$h_n = \frac{\min(n,\; height(R_i) - 1 - n)}{height(R_i)} \left| \sum_{m=0}^{width(R_i)-1} r_{m,n} - \sum_{m=0}^{width(R_i)-1} r_{m,n+1} \right|$$

The first part of each of the above formulas prevents creating very thin or very flat rectangles. This will be called "Fisher's rectangle degradation prevention mechanism" and it utilizes the function $g_{Fisher}$, which is applied to the possible cutting line locations (indexes of the columns/rows):

$$g_{Fisher}(x) = \frac{\min(x,\; x_{max} - x)}{x_{max}}$$

where $x \in \langle 0, x_{max} \rangle$ and $x$ is an integer.

Thus the formulas for $v_m$ and $h_n$ can be presented in the following form:

$$v_m = g_{Fisher}(m) \cdot E^{v}_{ED}(m), \qquad h_n = g_{Fisher}(n) \cdot E^{h}_{ED}(n)$$

where the functions $E^{v}_{ED}$ and $E^{h}_{ED}$ are used to detect the most significant vertical and horizontal line:

$$E^{v}_{ED}(m) = \left| \sum_{n=0}^{height(R_i)-1} r_{m,n} - \sum_{n=0}^{height(R_i)-1} r_{m+1,n} \right|$$

$$E^{h}_{ED}(n) = \left| \sum_{m=0}^{width(R_i)-1} r_{m,n} - \sum_{m=0}^{width(R_i)-1} r_{m,n+1} \right|$$

The rectangle degradation prevention mechanism in the above-presented formulas has a great influence on their product. Because of this, the search for the most significant edge may be misled. Because of such a mistake, a very significant edge might be found in the middle of a block that cannot be divided (because of the size of the block). In order to inspect whether this feature of the original mechanism has a negative influence on the resulting image segmentation and compression quality, a simple alternate version of the above formulas was elaborated, which minimizes the influence of the mechanism on the outcome of block splitting. These alternate formulas use a binary function for the degradation prevention. The function eliminates cutting line locations that cannot produce two rectangles meeting the range size threshold $size_t$. All other locations are given the same weight:

$$g_{flat}(x) = \begin{cases} 0 & \text{when } (x < size_t) \lor (x_{max} - x < size_t) \\ 1 & \text{otherwise} \end{cases}$$

The functions $E^{v}_{ED}$ and $E^{h}_{ED}$ remain unchanged. The "flat" rectangle degradation prevention mechanism gives the following $v_m$ and $h_n$ functions:

$$v_m = \begin{cases} 0 & \text{when } (m < size_t) \lor (width(R_i) - 1 - m < size_t) \\ \left| \sum_{n} r_{m,n} - \sum_{n} r_{m+1,n} \right| & \text{otherwise} \end{cases}$$

$$h_n = \begin{cases} 0 & \text{when } (n < size_t) \lor (height(R_i) - 1 - n < size_t) \\ \left| \sum_{m} r_{m,n} - \sum_{m} r_{m,n+1} \right| & \text{otherwise} \end{cases}$$

In the work devoted to optimal hierarchical partitions [Sau98], a different criterion is used to determine the line along which the block shall be divided. The split is performed in such a way that the sum of the squared errors between the resulting blocks and blocks consisting of pixels equal to the average intensity of the appropriate part of the range block is minimized. Thus, the cutting line minimizes the sum of the intensity variances within the two new blocks. The formulas for the errors are:

$$E^{v}_{VM}(m) = \sum_{i=0}^{m} \sum_{j=0}^{height(R_i)-1} \left( r_{i,j} - dc_{left}(m) \right)^2 + \sum_{i=m+1}^{width(R_i)-1} \sum_{j=0}^{height(R_i)-1} \left( r_{i,j} - dc_{right}(m) \right)^2$$

$$E^{h}_{VM}(n) = \sum_{i=0}^{width(R_i)-1} \sum_{j=0}^{n} \left( r_{i,j} - dc_{top}(n) \right)^2 + \sum_{i=0}^{width(R_i)-1} \sum_{j=n+1}^{height(R_i)-1} \left( r_{i,j} - dc_{bottom}(n) \right)^2$$

The minimal values of these formulas yield the best vertical and horizontal cutting lines. If $\min(v_0, v_1, \ldots, v_{width(R_i)-1}) \leq \min(h_0, h_1, \ldots, h_{height(R_i)-1})$, then the division is done along the vertical line, otherwise along the horizontal one.

Also, different rectangle degradation mechanisms are used. First of all, the resulting rectangles cannot have fewer than 2 pixels in width or in height. Besides, the sum of squared errors is multiplied by the following function:

$$g_{Saupe}(x) = 0.4 \left[ \left( \frac{2x}{x_{max}} - 1 \right)^2 + 1 \right]$$

where $x = m$ and $x_{max} = width(R_i) - 1$, or $x = n$ and $x_{max} = height(R_i) - 1$, depending on whether $v_m$ or $h_n$ is being calculated.

The formulas for $v_m$ and $h_n$ in this approach are as follows:

$$v_m = g_{Saupe}(m) \cdot E^{v}_{VM}(m), \qquad h_n = g_{Saupe}(n) \cdot E^{h}_{VM}(n)$$

The $v_m$ and $h_n$ are not maximized as in Fisher's method but minimized.

Altogether, there are six possible methods to divide blocks. These methods are created by combining different elements of the above-described approaches:

– Edge detection with Fisher's rectangle degradation prevention mechanism:
  $v_m = g_{Fisher}(m) \cdot E^{v}_{ED}(m)$, $h_n = g_{Fisher}(n) \cdot E^{h}_{ED}(n)$
– Variance minimization with Saupe's rectangle degradation prevention mechanism:
  $v_m = g_{Saupe}(m) \cdot E^{v}_{VM}(m)$, $h_n = g_{Saupe}(n) \cdot E^{h}_{VM}(n)$
– Edge detection with Saupe's rectangle degradation prevention mechanism:
  $v_m = g_{Saupe}(m) \cdot E^{v}_{ED}(m)$, $h_n = g_{Saupe}(n) \cdot E^{h}_{ED}(n)$
– Variance minimization with Fisher's rectangle degradation mechanism:
  $v_m = g_{Fisher}(m) \cdot E^{v}_{VM}(m)$, $h_n = g_{Fisher}(n) \cdot E^{h}_{VM}(n)$
– Edge detection with the flat rectangle degradation mechanism:
  $v_m = g_{flat}(m) \cdot E^{v}_{ED}(m)$, $h_n = g_{flat}(n) \cdot E^{h}_{ED}(n)$
– Variance minimization with the flat rectangle degradation mechanism:
  $v_m = g_{flat}(m) \cdot E^{v}_{VM}(m)$, $h_n = g_{flat}(n) \cdot E^{h}_{VM}(n)$

When Fisher's rectangle degradation mechanism is used with Saupe's splitting technique (variance minimization), or Saupe's rectangle degradation mechanism is used with the splitting technique based on Fisher's work (edge detection), then the mathematical function used to prevent rectangle degradation has to be changed. The new function ($g'$) is based on the original one ($g$): $g'(x) = \max(g) - g(x)$. The $\max(g)$ denotes the maximum of the function $g$. Thus, Saupe's rectangle degradation prevention mechanism is changed to

$$g'_{Saupe}(x) = 0.8 - 0.4 \left[ \left( \frac{2x}{x_{max}} - 1 \right)^2 + 1 \right]$$

and Fisher's mechanism to

$$g'_{Fisher}(x) = \frac{x_{max} - 2 \min(x,\; x_{max} - x)}{2\, x_{max}}$$

This change is necessary because the two splitting techniques use different methods to calculate the cutting line and to decide whether the horizontal or the vertical line shall be used.

One has to keep in mind that when the range block division is made along the most significant horizontal/vertical edge (the splitting method based on Fisher's work), then the maximal value among all the $v_m$ and $h_n$ values points out the cutting line position ($m \in \langle 0, width(R_i) - 1 \rangle$, $n \in \langle 0, height(R_i) - 1 \rangle$). When the division should minimize the variance in the produced blocks (Saupe's approach), then the minimal value among all $v_m$ and $h_n$ values determines the position of the cutting line.

Independently of the approach used to split blocks and of the rectangle degradation prevention mechanism, the blocks are divided only if no codebook block can be found for the current range. Such a situation may be caused by the size of the range block – the range block has to be at least $CF$ times smaller than the image that is being encoded ($CF$ denotes the contraction factor used by the spatial contraction transformation $\tau^C$). However, the most frequent reason for the necessity to divide a range is that the best match for the range does not meet the error threshold. The error threshold (also called the distance tolerance criterion) is set by the user before starting the encoding.


The output of this phase may also be twofold. If there does not exist a division of the range block that will produce two blocks with sides larger than or equal to the range size threshold $size_t$ (preset by the user), the range block will be sent to be stored in the fractal code. The second possible output is a pair of range blocks divided by the mechanism that is in force for the current encoding process. If the splitting ends with success, then the divided range block $R_i$ is omitted in further processing and the two new ranges $R_{i1}$, $R_{i2}$ are added to the queue of ranges to be encoded.
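The splitting decision for one of the six combinations – edge detection with Fisher's degradation prevention – can be sketched as follows, using the E_ED and g_Fisher formulas reconstructed above; the block is assumed to be a NumPy array of shape (height, width) with both sides at least 2, and the choice of x_max follows the g_Fisher definition given earlier.

    import numpy as np

    def g_fisher(x, x_max):
        # Fisher's rectangle degradation prevention weight
        return min(x, x_max - x) / x_max

    def hv_cut(block):
        # return ('v', m) to cut between columns m and m+1,
        # or ('h', n) to cut between rows n and n+1
        block = np.asarray(block, dtype=float)
        height, width = block.shape
        col_sums = block.sum(axis=0)          # one sum per column
        row_sums = block.sum(axis=1)          # one sum per row
        v = [g_fisher(m, width - 1) * abs(col_sums[m] - col_sums[m + 1])
             for m in range(width - 1)]
        h = [g_fisher(n, height - 1) * abs(row_sums[n] - row_sums[n + 1])
             for n in range(height - 1)]
        best_v, best_h = int(np.argmax(v)), int(np.argmax(h))
        if v[best_v] >= h[best_h]:
            return "v", best_v
        return "h", best_h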

5.3. Codebook

In the proposed encoding method, the codebook blocks are described with its position in the spatially contracted image, block size and identifier of the symmetry operation put on the codebook block. During the encoding, the codebook blocks are not stored and processed as matricies of pixels but only their descriptions, which allow to load the proper pixels from image when they are needed.

For clarity, the term "sub-codebook" is introduced; it shall be understood as the subset of the codebook's blocks that contains only those codebook blocks that will be compared with a given range block during the search. All same-sized range blocks are compared with codebook blocks from the same sub-codebook. In fractal compression based on uniform partitioning there would be only one sub-codebook and it would contain all codebook blocks. The HV-partitioning method is characterized by a high diversity of range block shapes, which results in a great number of sub-codebooks. The more sub-codebooks there are, the larger the codebook is – more time is needed to build it and more memory is needed to store it.

Two different approaches to the codebook are proposed and investigated.

5.3.1. On-the-fly Codebook

The first type, called the "on-the-fly codebook" or the "light codebook", is only a piece of functionality that takes as input the position of the last considered codebook block and produces the next codebook block. If symmetry operations are allowed, then the identifier of the symmetry operation applied to the last codebook block also has to be passed.

The on-the-fly codebook creates the codebook block descriptions during the search process (when they are needed) and automatically disposes of a description if the codebook block cannot be matched with the currently considered range block. This means that there exists no collection of codebook blocks and that the sub-codebooks are created independently for each range block, even if there are range blocks that require exactly the same sub-codebooks (this takes place when two range blocks have exactly the same dimensions and the algorithm pairs each range block with the best matching codebook block).

If the algorithm is not forced to find the best match for each range block, the building process of a sub-codebook for a concrete range block ends when the first matching codebook block is found. After creation of a codebook block, its inner products are calculated ($\langle C, C \rangle$ and $\langle C, \mathbf{1} \rangle$, necessary to calculate the coefficients of the intensity transformation). The independence in building sub-codebooks results in repetition of these calculations.

5.3.2. Solid Codebook

The "solid codebook" also provides a codebook block when it is needed during the search, in exactly the same manner as the light codebook. However, the solid codebook is also a container for the codebook blocks, which is filled up in a preliminary phase. For all codebook blocks that will be compared with more than one range block, the inner products are computed only once because all created codebook block descriptions are stored for further use.

The drawback of this codebook is the time cost caused by the preliminary phase. Another drawback is the fact that blocks might be added to the solid codebook that cannot be included in any sub-codebook, which increases the time cost and memory use. Even if a codebook block falls into a sub-codebook, it is not guaranteed that it will be needed – for example when the user does not wish to always find the best match and all range blocks of the same size are paired with a codebook block before all other blocks within the same sub-codebook are tested.

These drawbacks do not occur in the light codebook, where only those codebook block descriptions that are actually needed are created, but some of them have to be constructed (with inner product calculations) from scratch more than once.

The most time-consuming part of creating the codebook block descriptions is the calculation of the inner products. In order to avoid calculating inner products for codebook blocks that will never be used during the encoding, it is possible to postpone these calculations until the first access to the codebook block.

5.3.3. Hybrid Solutions

In order to balance the advantages and disadvantages of the two presented approaches to codebooks, a hybrid codebook is introduced. Such a codebook combines the advantages of the two above-presented approaches while downplaying their drawbacks.

Instead of creating a solid codebook filled with blocks, only a framework of the codebook is created in the preliminary phase. The solid codebook is filled during the encoding using the light codebook. Simply, before searching for a codebook block matching a given range block, the algorithm checks whether range blocks of the same size (i.e. range blocks that use the same sub-codebook) have already been encoded. If there were such range blocks then successive codebook blocks are taken from the solid codebook, otherwise from the light codebook.

When the light codebook is in use, all codebook blocks provided by it are automatically inserted into the solid codebook. If the search is to be interrupted after finding the first codebook block that fulfills the error tolerance condition (according to the user’s choice), then all untested codebook blocks also have to be defined by the light codebook in order to make the solid codebook complete. A sketch of this variant is given below.
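The following Java sketch illustrates this first hybrid variant under stated assumptions: the class and method names (HybridCodebook, LightCodebook, subCodebookFor) are hypothetical, the sub-codebooks are keyed simply by block shape, and the CodebookBlockDescription type is the sketch shown earlier; it is an illustration, not the actual implementation.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HybridCodebook {

    /** Hypothetical view of the on-the-fly ("light") codebook: it only produces
     *  the next codebook block description for a given block shape. */
    public interface LightCodebook {
        CodebookBlockDescription first(int width, int height);
        CodebookBlockDescription next(CodebookBlockDescription previous);
    }

    // Sub-codebooks of the solid codebook, keyed by block shape ("width x height").
    private final Map<String, List<CodebookBlockDescription>> solid =
            new HashMap<String, List<CodebookBlockDescription>>();

    /** Returns the sub-codebook for the given shape. The first time a shape is
     *  requested, the blocks are generated by the light codebook and inserted
     *  into the solid codebook so that later range blocks of the same size can
     *  reuse the descriptions (and their already computed inner products). */
    public List<CodebookBlockDescription> subCodebookFor(int width, int height,
                                                         LightCodebook light) {
        String key = width + "x" + height;
        List<CodebookBlockDescription> sub = solid.get(key);
        if (sub == null) {
            sub = new ArrayList<CodebookBlockDescription>();
            for (CodebookBlockDescription block = light.first(width, height);
                 block != null;
                 block = light.next(block)) {
                sub.add(block);
            }
            solid.put(key, sub);
        }
        return sub;
    }
}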

The larger a range block is, the more possible proportions of its sides’ lengths exist. Each such proportion generates a separate sub-codebook. At the same time, the larger a codebook block is, the smaller the number of same-sized codebook blocks that can be packed in the image (the offset between adjacent codebook blocks is the same regardless of their size) – thus, most probably, the larger a codebook block is, the less numerous its sub-codebook will be. This observation is the basis of the second way of combining the solid and light codebooks – all range blocks smaller than a user-given size use the solid codebook (which is created only up to this preset size) and all larger range blocks are bound to the light codebook.

5.4. Symmetry Operations

The implemented compression algorithm can optionally use isometries while the codebook is created. The isometries are realized by permuting the pixels of the block. The top left corner of the block remains in the same location after any of the symmetry transformations because this vertex describes the location of the codebook block in the image. There are eight possible symmetry operations (presented in figure 5.3):

identity: $\tau_{S_1}(c_{i,j}) = c_{i,j}$

orthogonal reflection about the mid-horizontal axis: $\tau_{S_2}(c_{i,j}) = c_{i,\,height-1-j}$

orthogonal reflection about the mid-vertical axis: $\tau_{S_3}(c_{i,j}) = c_{width-1-i,\,j}$

orthogonal reflection about the first diagonal: $\tau_{S_4}(c_{i,j}) = c_{j,i}$

orthogonal reflection about the second diagonal: $\tau_{S_5}(c_{i,j}) = c_{height-1-j,\,width-1-i}$

rotation around the point $(\max(width, height)/2,\ \max(width, height)/2)$ by 90 deg: $\tau_{S_6}(c_{i,j}) = c_{j,\,width-1-i}$

rotation around the point $(width/2,\ height/2)$ by 180 deg: $\tau_{S_7}(c_{i,j}) = c_{width-1-i,\,height-1-j}$

rotation around the point $(\min(width, height)/2,\ \min(width, height)/2)$ by $-90$ deg: $\tau_{S_8}(c_{i,j}) = c_{height-1-j,\,i}$

The symmetry operations are used in the above order, i.e. when all codebook blocks transformed with the symmetry operation indexed $l$ in the above list have been considered, the contracted domain blocks are subjected to the $(l+1)$-th symmetry operation and the resulting blocks are added to the solid codebook or successively tested (light codebook).
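The following Java sketch shows how these isometries can be realized by permuting pixel coordinates. For clarity it assumes a square block of side s (for rectangular blocks the diagonal reflections and the ±90 degree rotations swap the block’s dimensions); the class and method names are hypothetical. The pixel array is indexed as c[row][col], while the formulas above use $c_{i,j}$ with $i$ as the column and $j$ as the row index.

public final class Isometry {

    /** Returns the pixel of block c that the symmetry operation op (1..8)
     *  places at position (column i, row j) of the transformed block. */
    public static int apply(int[][] c, int op, int i, int j, int s) {
        switch (op) {
            case 1: return c[j][i];                       // identity
            case 2: return c[s - 1 - j][i];               // reflection, mid-horizontal axis
            case 3: return c[j][s - 1 - i];               // reflection, mid-vertical axis
            case 4: return c[i][j];                       // reflection, first diagonal
            case 5: return c[s - 1 - i][s - 1 - j];       // reflection, second diagonal
            case 6: return c[s - 1 - i][j];               // rotation by 90 deg
            case 7: return c[s - 1 - j][s - 1 - i];       // rotation by 180 deg
            case 8: return c[i][s - 1 - j];               // rotation by -90 deg
            default: throw new IllegalArgumentException("unknown symmetry operation: " + op);
        }
    }

    /** Applies the symmetry operation to a whole square block. */
    public static int[][] transform(int[][] c, int op) {
        int s = c.length;
        int[][] result = new int[s][s];
        for (int j = 0; j < s; j++)
            for (int i = 0; i < s; i++)
                result[j][i] = apply(c, op, i, j, s);
        return result;
    }
}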

The isometries enlarge the codebook (and each sub-codebook) eight times; thus, it should be possible to find a better match for each range block because there are more options. However, a higher number of codebook blocks to consider also results in a longer encoding time. Because the usefulness of isometries is very controversial (see 3.3.2), it will be checked whether there is any benefit from using them.

5.5. Constructing the fractal code

5.5.1. Standard Approach

The method of constructing the binary code for the fractal operator is similar in both Fisher’s [Fis92b] and Saupe’s [Sau98] fractal compression methods based on HV partitions. The code consists of information about the partition and information regarding the transformations. The partition information is actually an overhead brought about by the HV partitioning method; there are methods, e.g. based on uniform squares or on quadtrees, that do not require this type of information attached to the binary representation of the fractal operator.

Figure 5.3. Isometries for a rectangular block: (a) identity, (b) orthogonal reflection about the horizontal axis, (c) orthogonal reflection about the vertical axis, (d) orthogonal reflection about the first diagonal, (e) orthogonal reflection about the second diagonal, (f) rotation by 90 deg, (g) rotation by 180 deg, (h) rotation by −90 deg.

Here the description of the whole tree of range blocks is stored. The binary tree is created during encoding – when a block is divided, it automatically becomes the parent node of the newly created blocks. When encoding is finished, the range blocks that constitute the fractal operator are the leaves of the tree.

The proposed algorithm of this standard approach is as follows. The fractal code is constructed by traversing the tree. Each node is reflected in the code, but different information is stored about inner nodes than about leaves. The description of any inner node $R_{IN}$ consumes $L$ bits:

$$L = \begin{cases} 1 + \lceil \log_2(width(R_{IN}) + 1 - size_t) \rceil & \text{when } R_{IN} \text{ is divided vertically} \\ 1 + \lceil \log_2(height(R_{IN}) + 1 - size_t) \rceil & \text{when } R_{IN} \text{ is divided horizontally} \end{cases}$$

One bit is used to store the information whether the block is divided vertically or horizontally and the remaining bits contain the position of the cutting line. The number of bits for the line position varies – the smaller the range is, the fewer bits are needed. The number of bits may also be reduced by utilizing the knowledge of the minimal block size to eliminate impossible positions of the line – the range size threshold can be passed once for all ranges (in the file header).


The leaf ranges produce code that describes the transformations they are part of. The description of a transformation consists of the position of the range block (indicated by the descriptions of the inner nodes), the position of the domain block and the coefficients of the intensity transform. If the isometries are allowed, the identifier of the symmetry operation also has to be added to this description (whether or not isometries are used is stored in the file header). If the scaling coefficient of the intensity transformation is equal to zero then the codebook block location is not stored in the fractal code – it is replaced by a fixed (uniform) block at the decoder without any loss of quality.

The position of the domain block is normally stored on $\lceil \log_2[(M + 1 - size_t) \cdot (N + 1 - size_t)] \rceil$ bits. However, the author of this dissertation introduces the location of the codebook block into the fractal code instead of specifying the location of the domain block. The change has almost no effect on the fractal compression algorithm and the file format but allows saving $2 \log_2 CF$ bits per transformation, where $CF$ is the contraction factor of the transformation $\tau_C$.

Codebook blocks cannot be placed at an arbitrary point of the image – the location of their top left vertex must be in accordance with the domain offset set by the user. This means that the coordinates of each domain block are divisible by the domain offset and the coordinates of each codebook block are divisible by $domainOffset/CF$. This remark becomes the foundation of the next proposed improvement – the coordinates of any codebook block are translated to a coordinate system with unit size equal to $domainOffset/CF$. This step allows saving $\log_2(domainOffset/CF)$ bits per codebook block. Therefore, because for some number of transformations, where the scaling coefficient of the intensity transform is zero, no information about the codebook block location has to be stored, almost one bit is saved per transformation.

The quantized scaling coefficient of the intensity transform will be stored on 5 bits and the offset on 7 bits, which were reported as optimal in [Fis92a]. It is worth mentioning that in the optimal hierarchical method created by Saupe et al. [Sau98] only 6 bits were used for the offset.

For the symmetry operation indicator it is enough to spend 3 bits because there are only 8 possible isometries. As was already mentioned, it is very doubtful that the use of isometries has any positive effect. Thus, encoding without symmetry operations will also be checked, which should save not only these three bits per transformation but also time, because of the smaller size of the codebook. There is one more way to save some disk space.

5.5.2. New Approach

The above-presented manner of constructing the fractal code has proved itself by giving rate-distortion performance that ensured one of the top places to the fractal compression methods based on HV partitions. However, medical images form such a characteristic class of digital images that one can attempt to build a file format that best suits and adapts to the features of this class.

The idea

The fidelity of medical images has to be preserved at a very high level. At the same time, medical images contain natural objects, thus it is hard to find flat regions in the image. The fidelity of such images can be assured by an adequately large number of transformations, and this results in rather small range blocks because the size of the original images is restricted. This assumption is the basis of the proposed approach to encoding range blocks into binary code.

The original approach presented in the previous section stores the whole partition tree. This can be a disadvantage when there are many levels in this tree. The smaller the range blocks are, the higher the tree is and, because of this, the larger the information overhead becomes. The elaborated method attempts to describe only the leaves of the tree in an efficient way in order to avoid the superfluous descriptions of the non-leaf nodes.

As was said in the introduction to this chapter, irregular partitioning performs much better than the uniform/quadtree partitioning method. The application of adaptive quadtree partitioning to the splitting phase gave a further improvement. Thus, one may consider creating a fractal compression method with irregular partitions based on HV partitioning instead of the quadtree approach. Because the HV scheme is definitely superior to the quadtree, it might be expected that irregular partitioning based on HV ranges will outperform the irregular-region method based on the quadtree scheme. The standard approach, which stores the entire partitioning tree, seems to be inapplicable because the fractal compression method based on irregular regions modifies the partitioning of the image by merging some range blocks, and the final shape of the range blocks could not be stored by writing the locations of the cutting lines in the inner nodes. The proposed approach to the representation of the fractal operator can be easily adapted to the purpose of irregular regions (explained in section 5.5.4).

Bit Allocation per Transformation

The simplest way to eliminate the inner nodes from the fractal code is to describe the range block by storing its location and dimensions. However, this would be highly inefficient because the location would take $\lceil \log_2[(M + 1 - size_t) \cdot (N + 1 - size_t)] \rceil$ bits ($M \times N$ is the image size and $size_t$ the range size threshold). Moreover, the size of the range would take the same amount of space. For example, after encoding a 512 × 512 image, a single range would take 2 · 18 bits for the location and dimensions of the range, a further 18 bits for the location of the domain block, 12 bits for the intensity transformation coefficients and 3 bits for the isometries. All this together gives 69 bits per single transformation. According to [Sau98], 10000 ranges result in a PSNR equal to 39.1, which is a satisfactory result. This number of ranges gives 2.632 bits per pixel and a compression ratio of 3.04 when the above-mentioned method for range block description is used. The original approach to saving the transformations gives here a compression ratio of 6.47, 32.4 bits per range and 1.236 bits per pixel [Sau98]. This means that the simple solution is not a good one in this case – it performs twice as poorly as the tested safe solution.

Thus, the goal of the new approach is to store the location and width of the range on a smaller number of bits. This can be achieved by translating the position of a range block into another coordinate system. One way to do this is to store the position of the range relative to the previous range. The distance between two ranges is minimal or close to minimal when:

the blocks have one common border, or

the blocks have one common vertex.

This solution has several drawbacks. First of all, it is a very weak improvement because the height and width of the ranges are still constrained only by the size of the image and the factor $CF$ of the contraction transformation, i.e. they can reach even the values of $M/CF$ and $N/CF$ respectively. Thus, if the contraction factor $CF$ is equal to 2 then only one bit can be saved on a single transformation. Another problem would be the ordering of the transformations before storing them to the file. It can happen that two ranges that are encoded one after another adjoin opposite borders of the image. This, however, can be solved by allowing wrapping around the image. The last drawback of this solution is the fact that the space needed for the size of the range is not reduced at all. In general, this attempt results in a very weak improvement and the approach remains inferior to the original one.

Figure 5.4. The structure of the fractal code describing a single transformation depending on the position of the range block with respect to the position of the underlying cell: (a) whole range block in one cell; (b) the bottom right vertex in the neighboring cell to the East; (c) the bottom right vertex in the neighboring cell to the South; (d) the bottom right vertex in the neighboring cell to the South-East; (e) the bottom right vertex in a distant cell.

The author comes up with an idea of placing a grid onto the image, which turns out to be much more promising. All cells in the grid shall have the same size. In order to utilize the bit allocation to the maximal extent, the width and height of the cell shall be powers of two.

The location of each range is translated to the coordinate system with the point (0, 0) in the upper left corner of the cell that encloses the upper left vertex of the range block. This translation can be performed without any major cost. All ranges of one cell shall be grouped together in the fractal code in order to avoid placing the coordinates of the cell (in the image’s coordinate system) before the coordinates of each range block (in the proper cell’s coordinate system).

The order of traversing the cells can be fixed at the encoder and at the decoder, and at the beginning of each group of ranges that lie in the same cell, the number of ranges in the group is placed in the fractal code. These two pieces of information – the traversal order and the number of ranges in each cell – are enough to find the location of each cell. Thus, it is also enough to retranslate the coordinates of each range block to the image’s coordinate system. Since the bit allocation per transformation can be fully determined by the decoder, no other information is necessary. A sketch of the coordinate translation is given below.
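A minimal sketch of the translation between the image coordinate system and the cell coordinate system is shown below; the class and method names are hypothetical and the cell dimensions are assumed to be powers of two, as required above.

public final class GridTranslation {

    /** Index of the cell (column, row) that encloses the point (x, y). */
    public static int[] cellOf(int x, int y, int cellWidth, int cellHeight) {
        return new int[] { x / cellWidth, y / cellHeight };
    }

    /** Coordinates of (x, y) in the coordinate system of its enclosing cell,
     *  i.e. relative to that cell's upper left corner. */
    public static int[] toCellSystem(int x, int y, int cellWidth, int cellHeight) {
        return new int[] { x % cellWidth, y % cellHeight };
    }

    /** Inverse translation used by the decoder: given the cell index and the
     *  cell-relative coordinates, recover the image coordinates. */
    public static int[] toImageSystem(int cellCol, int cellRow, int relX, int relY,
                                      int cellWidth, int cellHeight) {
        return new int[] { cellCol * cellWidth + relX, cellRow * cellHeight + relY };
    }
}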

The solution can give considerable disk space savings, which will be investigated on the example where a 512 × 512 image is encoded into 10000 ranges. The average number of pixels in a range is then about 26. When a grid with cell size 32 × 32 is put on the image, almost 40 average-sized ranges can be packed in one cell. The location of a range block would be stored in this solution on 5 bits instead of 9 bits per coordinate, but the number of ranges within the cell has to be stored for each cell (i.e. for each 40 ranges). This number can be stored on $\lceil \log_2(32^2/size_t^2) \rceil$ bits per cell. If the size threshold is 2 then 8 bits are needed to store the number of ranges within one cell; this gives a cost of 0.2 bits per range block. Compared to the simple solution presented before, 3.8 bits are saved per transformation. This is still not enough and further improvements must be made.

When there is a large number of ranges, most of them are rather small. Thus, there is a good chance that an entire range block will fit into one cell. This gives the opportunity of storing the size of ranges efficiently. Instead of storing the width and the height of the range block, the author suggests storing the coordinates of two opposite vertices with utilization of the grid put on the image. Since there will always be ranges that lie on the edges of the cells, a single bit has to be introduced to indicate whether the bottom right vertex of the range lies in the same cell as the upper left one. If this bit is set to 1 then the coordinates of the bottom right vertex are translated to the coordinate system of the cell of the upper left vertex. Otherwise, the location of the second vertex is given in the coordinate system of the image.

The next step made by the author to optimize the fractal code length is based on the observation that there can be two reasons why a range block cannot be fitted into a cell. The first reason is that the width or the height of the block is larger than the cell. Such ranges can always appear and the absolute coordinates of the second vertex have to be stored in these cases.

However, the second reason why the bottom right vertex lies on the other side of the cell border than the upper left vertex can be used to shorten the fractal code. Such a situation takes place when the upper left vertex lies too close to the cell border and can happen even with range blocks of the minimal allowed size. To eliminate the large overhead of storing the second vertex location, a neighborhood of the cells is introduced. Instead of the absolute location of the second vertex, the neighboring cell in which the vertex lies is indicated, together with the location of the vertex in the coordinate system of this neighboring cell.

Figure 5.5. Grid put on an image. The currently processed cell and the neighboring cells are marked with labeled triangles. The order of traversing the grid is shown by the arrows in the background.

When ranges with width and height smaller than the width/height of the cells are considered, there are only three neighboring cells in which the second vertex can lie. In these cells can also lie the vertices of blocks that have width or height larger than the width/height of the cell but smaller than the cell dimensions multiplied by two. In this situation, it can also happen that the bottom right vertex lies beyond the borders of the neighboring cells. If a range has width or height larger than the width/height of the cells multiplied by two then the second vertex will always lie beyond the borders of the neighboring cells. However, it is believed that such a situation will occur rarely in accurate compression of medical images.

To determine whether the second vertex lies in a neighboring cell, and to appoint that cell, two bits are needed. Three of the four possible values obtained from these two bits indicate the neighboring cell – in our case the neighboring cells are: to the East, to the South, and to the South-East. The fourth value indicates that the bottom right vertex does not lie in any of the neighboring cells. For a range block with width or height larger than the side of the cells, an oversized representation is forced by the solutions described here. The size of the cells should be automatically adjusted in order to make such situations really rare. A sketch of writing the second vertex under this scheme follows.
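The following Java sketch illustrates how the bottom right vertex could be written under this scheme. BitWriter, CellHuffman and the concrete 2-bit neighbor codes (0 = East, 1 = South, 2 = South-East, 3 = elsewhere) are hypothetical assumptions introduced only for the example.

public class SecondVertexWriter {

    public interface BitWriter { void writeBits(int value, int numBits); }
    public interface CellHuffman { void writeCellId(BitWriter out, int cellCol, int cellRow); }

    private static int log2(int powerOfTwo) { return Integer.numberOfTrailingZeros(powerOfTwo); }

    public static void write(BitWriter out, CellHuffman huffman,
                             int x2, int y2,            // bottom right vertex, image coordinates
                             int cellCol, int cellRow,  // cell of the upper left vertex
                             int cellW, int cellH) {    // cell size (powers of two)
        int vCol = x2 / cellW, vRow = y2 / cellH;        // cell of the bottom right vertex
        boolean sameCell = (vCol == cellCol && vRow == cellRow);
        out.writeBits(sameCell ? 1 : 0, 1);              // the single "same cell" bit
        if (!sameCell) {
            int dc = vCol - cellCol, dr = vRow - cellRow;
            if (dc == 1 && dr == 0)      out.writeBits(0, 2);  // neighbor to the East
            else if (dc == 0 && dr == 1) out.writeBits(1, 2);  // neighbor to the South
            else if (dc == 1 && dr == 1) out.writeBits(2, 2);  // neighbor to the South-East
            else {                                             // some distant cell
                out.writeBits(3, 2);
                huffman.writeCellId(out, vCol, vRow);          // variable length cell identifier
            }
        }
        // In every case the vertex coordinates themselves are stored relative
        // to the cell in which the vertex lies.
        out.writeBits(x2 % cellW, log2(cellW));
        out.writeBits(y2 % cellH, log2(cellH));
    }
}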

Theoretically, it does not matter whether the location of a second vertex lying in a distant cell is expressed in terms of the image coordinate system, or whether first the identifier of the cell in which it lies is given and then the coordinates in the system of that cell. However, the second solution creates new possibilities of further improvement. The number of cells is relatively small compared to the number of pixels in the image. Thus, the cells can be treated as letters of an alphabet and encoded with a variable length code. It is obvious that some of the cells will be referenced more often because the distribution and size of the range blocks is not uniform in adaptive fractal compression when a natural image is being compressed. Thus, there is a great chance to shorten the code length. Adaptive Huffman compression seems to be right for this purpose. The elaborated adaptive Huffman coding method is presented in the next section.

The processes of saving transformations to the file and reading them from the file begin in the bottom right cell. In this cell there cannot be any ranges that do not lie entirely in the cell, because the right and bottom borders of the cell are at the same time the borders of the image, and the ranges that lie on the North and West borders are included in the neighboring cells instead of this one. While the ranges of this cell are processed, no Huffman tree is present. When the processing moves to the next cell, the bottom right cell becomes the first node of the tree. The cells are traversed in rows from right to left, and after a row is finished the processing moves to the row above. When the saving/reading of all ranges from a given cell is finished, the cell (as an alphabet symbol) is added to the Huffman tree. The more symbols there are in the tree, the longer the codes become for some of the cells. However, the most frequently referenced cells have the shortest codes assigned.

Adaptive Huffman Coding of Cell Identifiers

Huffman coding is a lossless entropy coding method that gives the best results among symbol-by-symbol coding methods. A symbol-by-symbol method assigns code words to successive input symbols in accordance with their order in the input stream. The code words have variable length and are in one-to-one relation with the input symbols. Better lossless compression results can be achieved with only one method – arithmetic coding, which is also an entropy coding technique. However, this method cannot be applied in connection with fractal coding because it carries a much higher computational cost, which is already too high in the fractal method.

The static version of Huffman coding calculates the occurrence frequencies of the symbols and then builds the Huffman tree based on that knowledge. The more times a symbol occurs, the closer the symbol is to the root of the Huffman tree; the number of occurrences becomes the weight of the symbol. The code words are created simply by traversing the tree from the root to the leaf with the currently considered symbol – at each node of the path a binary literal is added to the code word: 0 if the node is the left child of its parent and 1 in the opposite situation.

The static algorithm has two basic drawbacks. The first one is that the frequency table must be attached to the code. The second problem is the performance of the algorithm when the occurrence frequencies are similar – compression is achieved only when the input symbols have oversized binary representations in the input stream. The adaptive version of the Huffman coding algorithm solves both problems. Here the Huffman tree is rebuilt after encoding/decoding each input symbol.

However, the adaptive version brings about some new problems. The first problem is how to encode a symbol that occurs for the first time in the input stream. In such a situation, the Huffman tree contains only a subset of the input stream alphabet, containing only the symbols that occurred in the already encoded part of the input stream.

Figure 5.6. Example of a Huffman tree used for constructing the binary representation of the fractal operator. Nodes are marked with weights and indexes in the ordered list.

There are two ways to solve it. The first one is to add to the tree a leaf that represents any not-yet-transmitted symbol (NYT). When the encoder encodes some symbol for the first time, it stores the code of the NYT leaf followed by the binary representation of the new symbol (this requires $\log_2 |A|$ bits, where $A$ denotes the alphabet of the input stream and $|A|$ its size). The second solution is to add all input symbols to the tree with weight 1 before starting the encoding/decoding.

The proposed Huffman method differs from the basic algorithms and is designed especially for coding the cell identifiers in the fractal code of the format described above. Here the nodes can have weights in the range (0, 1) or any positive integer value. New symbols are added to the tree when there is a chance that they will be used. The order of traversing the cells while storing transformations to the file ensures that no cell will be referenced while describing the location of the lower right vertex of a range block before all range blocks that have their upper left vertex in that cell have been stored. A new node receives the weight $1 \cdot 10^{-5}$. Thanks to this, the root of any subtree whose leaves all hold not-yet-transmitted symbols will always have weight smaller than one.

As a proof, it is enough to say that the maximal image size is 2048 × 2048 and the minimal allowable cell size is 2 × 2. This gives the maximum possible number of symbols (possible cells’ coordinates) equal to 1024. A binary tree of height 10 ($height = \log_2(leafNo)$ in a full binary tree) is required when all symbols are not yet transmitted (no cell contains a second vertex of any range block). Because the weight of each inner node is equal to the sum of the weights of its children (which are equal in the described situation) and the number of nodes on a level is half that on the level below, the sum of weights of the nodes on each level is the same. Then the weight of the root of the tree (with only not-yet-transmitted symbols) will be

$$weight(root) = levelNo \cdot leafNo \cdot weight(leaf) = 10 \cdot 1024 \cdot 10^{-5} = 0.1024.$$

The weight of the root of any subtree whose leaves all contain not-yet-transmitted symbols cannot be larger than the calculated value; thus this root will be on the same or a lower level than the leaf with the symbol that occurred least frequently but did occur at least once. This fact is very important for the effectiveness of the algorithm because all symbols besides the least frequently occurring one will have optimal codes, just as if the codes were created by a static Huffman coding algorithm (imposed on the already encoded part of the input stream with all already transmitted symbols as the alphabet).

The simplest way to implement adaptive Huffman coding is to rebuild the whole Huffman tree (in exactly the same manner as it is built in the static version) after processing each successive symbol occurrence. This seems rather ineffective when there is a large number of symbols in the tree, because a single weight increment may propagate only through a limited part of the tree or may even cause no changes at all. This is why a tree updating algorithm is introduced that involves only the smallest subtree that has to be transformed in order to preserve the properties of the Huffman tree. The weights of the nodes in the tree have to grow from left to right on each level and from the leaves to the root.

The tree updating algorithm uses a list of all nodes, ordered as shown in figure 5.6. When a symbol occurs on the input, the weight of the appropriate node is incremented (if the weight was previously an integer) or set to 1 (if the weight was smaller than one) – this node is denoted $node_1$. Then it is checked whether there are elements with higher indexes but with smaller weight than the updated symbol. If such symbols exist in the list, the node containing the currently updated symbol is swapped with the node that contains the symbol with the highest index and a weight smaller than the new weight of the current symbol – this node is marked $node_2$.

However, this simple refactoring can be done only if there does not exist a subtree that contains both of the nodes to be swapped. The opposite situation happens when $node_2$ is an ancestor of $node_1$, and it requires a more complicated transformation. If $node_1$ is the left child of $node_2$ or a descendant of the left child, then $node_1$ is swapped with the right child of $node_2$. If $node_1$ is the right child of $node_2$ then no tree update is needed. If $node_1$ is a descendant of the right child of $node_2$, then the children of $node_2$ are swapped with one another and after that $node_1$ (which is now a descendant of the left child) is swapped with the right child of $node_2$.

When the location of $node_1$ is established, the whole algorithm described above is repeated for its new parent, because its weight shall also be increased. The processing is recursively repeated until the root is reached.

If $node_1$ (with the currently occurring symbol) had, before the increment, a weight equal to or larger than one, then $node_2$ has the same value. Thus, the weight of the previous parent of $node_1$ remains unchanged. But if $node_1$ contains a symbol that occurs for the first time, then $node_2$ can have any weight smaller than 1 (of course, only values $k \cdot 10^{-5}$ are allowed). This means that the subtree containing all not-yet-transmitted symbols has to be updated with the same algorithm – starting with $node_2$ (placed in its new location) and ending at the node closest to the root with weight smaller than one.
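The following Java sketch shows, under simplifying assumptions, the core increment-and-swap step of this update: it uses the ordered node list, finds the highest-indexed node with a smaller weight and swaps the two subtrees. The ancestor special cases described above and the fractional weights of not-yet-transmitted symbols are deliberately omitted, and all class and field names are hypothetical.

import java.util.List;

public class AdaptiveHuffmanUpdate {

    public static class Node {
        double weight;
        Node parent, left, right;
        int listIndex;            // position in the ordered node list
    }

    private final List<Node> ordered;   // nodes ordered by non-decreasing weight

    public AdaptiveHuffmanUpdate(List<Node> ordered) { this.ordered = ordered; }

    /** Called after a symbol occurrence; node1 is the leaf of that symbol. */
    public void update(Node node1) {
        while (node1 != null) {
            double newWeight = node1.weight + 1;
            // Find the node with the highest index whose weight is still smaller
            // than the new weight of node1 (the swap candidate, node2).
            Node node2 = null;
            for (int i = ordered.size() - 1; i > node1.listIndex; i--) {
                if (ordered.get(i).weight < newWeight) { node2 = ordered.get(i); break; }
            }
            if (node2 != null && !isAncestor(node2, node1)) {
                swapInTree(node1, node2);
            }
            node1.weight = newWeight;
            node1 = node1.parent;   // the parent's weight must grow as well
        }
    }

    private boolean isAncestor(Node maybeAncestor, Node node) {
        for (Node n = node.parent; n != null; n = n.parent)
            if (n == maybeAncestor) return true;
        return false;
    }

    /** Swaps two nodes (together with their subtrees) and their list positions. */
    private void swapInTree(Node a, Node b) {
        Node pa = a.parent, pb = b.parent;
        if (pa.left == a) pa.left = b; else pa.right = b;
        if (pb.left == b) pb.left = a; else pb.right = a;
        a.parent = pb;
        b.parent = pa;
        int ia = a.listIndex, ib = b.listIndex;
        ordered.set(ia, b);
        ordered.set(ib, a);
        a.listIndex = ib;
        b.listIndex = ia;
    }
}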


5.5.3. Choosing the Parameters

The proposed method of constructing the binary code can give varying effectiveness depending on the imposed size of the cells. The cell size is chosen by the algorithm for each fractal operator to be stored by estimating the number of bits per transformation through the calculation explained below.

Any transformation, independently of the location of the second vertex, always consumes a number of bits resulting from the following components ($W$ denotes the total number of transformations):

the location of the top left corner of the range: $L_1 = \lceil \log_2(width(cell) \cdot height(cell)) \rceil$ bits

the single bit needed to determine whether the second vertex lies in the same cell as the first vertex: $L_2 = 1$ bit

the intensity transformation coefficients: $L_3 = 5 + 7$ bits

the location of the matching codebook block’s top left corner in the appropriate cell: $L_4 = \log_2\!\left(\frac{M}{CR} \cdot \frac{N}{CR}\right)$ bits

the descriptions of the cells (number of ranges in a cell), whose cost is allocated equally to all ranges: $L_5 = \frac{1}{W} \cdot \left\lceil \log_2\!\left(\left\lceil \frac{width(cell)}{size_t} \right\rceil \cdot \left\lceil \frac{height(cell)}{size_t} \right\rceil + 1\right) \right\rceil \cdot \frac{M}{width(cell)} \cdot \frac{N}{height(cell)}$ bits

the header of the file, which contains the information required to properly read the fractal operator from the file and requires $L_6 = \sum_{x=a}^{e} L_{6x}$ bits:

the original image size: $L_{6a} = \frac{1}{W} \cdot \log_2(2048 \cdot 2048) = \frac{22}{W}$ bits

the range size threshold (maximal allowed range size threshold is 64): $L_{6b} = \frac{1}{W} \cdot \log_2 64 = \frac{6}{W}$ bits

the size of the cells used to create the fractal code (maximal allowed image size is 2048 × 2048): $L_{6c} = \frac{1}{W} \cdot \log_2(2048 \cdot 2048) = \frac{22}{W}$ bits

the contraction ratio used for fractal compression (maximal allowed $CF$ is 3, minimal allowed $CF$ is 1, the $CF$ value is always an integer): $L_{6d} = \frac{1}{W} \cdot \lceil \log_2 3 \rceil = \frac{2}{W}$ bits

the number of bits used to store the scaling and offset coefficients of the intensity transformations; the number of scaling bits may take values from the range $\langle 2, 10 \rangle$ and the number of offset bits from the range $\langle 2, 8 \rangle$: $L_{6e} = \frac{1}{W} \cdot (\lceil \log_2 8 \rceil + \lceil \log_2 6 \rceil) = \frac{6}{W}$ bits

So, the base bit cost of a single transformation is:

$$L_B = \sum_{x=1}^{6} L_x \text{ bits}$$
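As an illustration only, the following Java sketch evaluates the base bit cost $L_B$ for a candidate cell size from the components above; the function and parameter names are hypothetical, W stands for the number of transformations, and the file header components are taken as the constants derived above.

public final class BitCostEstimator {

    /** Base bit cost L_B of a single transformation for the candidate cell size
     *  cellW x cellH; M x N is the image size, W the number of transformations,
     *  sizeT the range size threshold and CR the contraction ratio. */
    static double baseBitCost(int cellW, int cellH, int M, int N,
                              int W, int sizeT, int CR) {
        double l1 = Math.ceil(log2(cellW * cellH));                    // range top left corner
        double l2 = 1.0;                                               // the "same cell" bit
        double l3 = 5 + 7;                                             // intensity transform coefficients
        double l4 = log2(((double) M / CR) * ((double) N / CR));       // codebook block location
        double rangesPerCell = Math.ceil((double) cellW / sizeT)
                             * Math.ceil((double) cellH / sizeT) + 1;
        double cells = ((double) M / cellW) * ((double) N / cellH);
        double l5 = Math.ceil(log2(rangesPerCell)) * cells / W;        // per-cell range counters
        double l6 = (22.0 + 6 + 22 + 2 + 6) / W;                       // file header, components L6a..L6e
        return l1 + l2 + l3 + l4 + l5 + l6;
    }

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }
}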

The ranges that have their bottom right vertex in a different cell than the cell of the upper left vertex carry additional costs. These costs relate only to a limited subset of the transformations. The number of elements of this subset depends on the size of the cells and on the sizes of the ranges. If the second vertex is placed in one of the neighboring cells then the overhead is relatively small – only 2 bits. However, when it lies in some more distant cell, the overhead is equal to these two bits plus the bit length of the Huffman code identifying the cell in which the vertex lies. These additional costs are produced by ranges that can be divided into three classes depending on their size:

ranges with width smaller than the width of the cells and height smaller than the height of the cells require 2 additional bits when the range is placed closer to the bottom border of the cell than its height or closer to the right border of the cell than its width. There is no chance that such a range will require more bits than in this situation.

ranges with width larger than the cell width or height larger than the cell height, but with neither of these two dimensions larger than the corresponding cell dimension multiplied by two. This class of ranges always requires additional bits to describe the location of the second vertex:

if the $x$ coordinate of the upper left vertex of the range is smaller than $2 \cdot width(cell) - width(R_i)$ and the $y$ coordinate is smaller than $2 \cdot height(cell) - height(R_i)$, then the second vertex lies in a neighboring cell and only 2 additional bits are needed

if at least one of the above conditions is not fulfilled, then 2 bits plus the length of the Huffman code for the cell identifier are added to the base bit length.

the last situation takes place when the width of the range is larger than the width of two cells or its height is larger than double the height of the cells. In such a case the identifier of the cell in which the second vertex lies is always attached. Thus, the bit length of the transformation is extended by 2 bits plus the length of the Huffman code.

The exact number of bits per transformation can be obtained only after investigating each range block independently and summing the total additional bits required by the blocks. However, it is very important to be able to estimate an approximation of this value, because the cell size is chosen adaptively for each single image. For all powers of two larger than or equal to two and smaller than the width and the height of the image, the number of bits per transformation is approximately computed in order to find the best cell size. This approximate value is calculated based on the average size of the ranges and on all possible localizations of the range blocks that bring additional bits into the transformation description. Simply, out of all pixel positions, those are counted that yield two additional bits, and those that cause the bit length to increase by 2 plus the length of the Huffman code for the cell identifier. The numbers of such localizations, divided by the total number of pixels (i.e. the total number of possible range block localizations), give the expected percentages of range blocks that will require 2 additional bits and 2 bits plus the Huffman code length.

Thus, if the average range block dimensions are smaller than the currently considered cell size, the percentage of range blocks that have their second vertex outside the boundaries of the neighboring cells is assumed to be equal to zero ($p_{distant} = 0$). However, the percentage of ranges that require the two additional bits indicating the neighboring cell in which the second vertex lies is positive:

$$p_{neighboring} = 1 - \frac{(width(cell) - avgWidth(R)) \cdot (height(cell) - avgHeight(R))}{width(cell) \cdot height(cell)}, \qquad p_{distant} = 0$$

If at least one of the average block dimensions turns out to be larger than the corresponding cell dimension, but the range block dimensions are still smaller than double the cell size, then both situations can occur. Some range block localizations increase the bit length of the transformation representation by two, and the remaining localizations enlarge the bit length by 2 plus the bit length of the Huffman code:

$$p_{neighboring} = \frac{\min(2 \cdot width(cell) - avgWidth(R),\, width(cell)) \cdot \min(2 \cdot height(cell) - avgHeight(R),\, height(cell))}{width(cell) \cdot height(cell)}$$

$$p_{distant} = 1 - p_{neighboring}$$

If the average width and average height of the range blocks are larger than the width and height of the cell respectively, then all potential localizations of range blocks require putting the cell identifier before the localization of the second vertex in the code of the transformation:

$$p_{neighboring} = 0, \qquad p_{distant} = 1$$

This estimation is performed by calculating $L$ for all pairs of values that can become the width or the height of the cells:

$$L = L_B + 2 \cdot p_{neighboring} + \left[2 + \log_2\!\left(\frac{M}{width(cell)} \cdot \frac{N}{height(cell)}\right)\right] \cdot p_{distant} \text{ bits}$$


The width and the height of the cell do not necessarily have to be equal to one another. By allowing various values of the width and the height, a better adaptation of the cell shape to the range blocks can be achieved, which minimizes the number of bits per transformation. One should notice that during this estimation the fixed-length identifiers of cells are used instead of the variable-length Huffman codes. A sketch of the cell size selection is shown below.
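The sketch below illustrates the resulting cell size selection; it is intended as an extension of the hypothetical BitCostEstimator sketch above (reusing baseBitCost and log2), uses the average range block dimensions avgW and avgH, and applies the three cases for $p_{neighboring}$ and $p_{distant}$ described above. It is an illustration of the estimation, not the thesis implementation.

/** Chooses the cell width and height (powers of two) that minimize the estimated
 *  number of bits per transformation; avgW and avgH are the average range block
 *  dimensions. Intended as an addition to the BitCostEstimator sketch above. */
static int[] chooseCellSize(int M, int N, int W, int sizeT, int CR,
                            double avgW, double avgH) {
    double bestBits = Double.MAX_VALUE;
    int[] best = { 2, 2 };
    for (int cw = 2; cw < M; cw *= 2) {
        for (int ch = 2; ch < N; ch *= 2) {
            double pNeighboring, pDistant;
            if (avgW <= cw && avgH <= ch) {
                // The second vertex can at worst reach a neighboring cell.
                pNeighboring = 1.0 - (cw - avgW) * (ch - avgH) / (cw * ch);
                pDistant = 0.0;
            } else if (avgW <= 2 * cw && avgH <= 2 * ch) {
                pNeighboring = Math.min(2 * cw - avgW, cw) * Math.min(2 * ch - avgH, ch)
                             / (double) (cw * ch);
                pDistant = 1.0 - pNeighboring;
            } else {
                pNeighboring = 0.0;                       // always a distant cell
                pDistant = 1.0;
            }
            double bits = baseBitCost(cw, ch, M, N, W, sizeT, CR)
                        + 2 * pNeighboring
                        + (2 + log2(((double) M / cw) * ((double) N / ch))) * pDistant;
            if (bits < bestBits) {
                bestBits = bits;
                best = new int[] { cw, ch };
            }
        }
    }
    return best;
}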

The result of this estimation will certainly differ from the actual number of bits per transformation because the simplification is large. When the average range block is smaller than the cell, it does not mean that there cannot occur ranges that span several cells and significantly increase the number of bits per transformation. The same goes for the last situation, where it is assumed that all ranges are stored on the maximal possible number of bits. The actual number of bits per transformation will surely be lower than the estimated one. The measure will perform especially poorly when the image contains large flat regions and, at the same time, many regions with a high density of details. In this case, the real result may differ a lot from the estimated one. However, in the most typical situations the formulas should be quite good models of how the actual number of bits per transformation reacts to changes of the cell size.

5.5.4. Adaptation to Irregular Regions Coding

A fractal coding method with irregular partitioning based on the HV scheme in the splitting phase would result in regions whose sides can have any length. When the uniform/quadtree scheme is employed, the range regions are not so flexible, because the side lengths of the irregular regions are equal to the smallest possible range block side created in the splitting phase multiplied by an integer.

Because of this, none of the methods of describing the region shape created for already existing irregular-region based coding methods can be used (see 3.1.4). The only way to store the shape of a region when the splitting was made by HV partitioning is to store the coordinates of each vertex of the region.

The method of constructing the binary representation of the fractal operator created with HV partitioning is nothing else but a method of efficiently storing the vertices of the rectangular blocks. The adaptation to irregular regions is not very large. Firstly, the binary representation of each region has to contain information about the number of vertices of the region whose coordinates are stored in the fractal code. Then, starting from the top left vertex, every second vertex is stored in the binary code. The coordinates of an omitted vertex can be calculated from the coordinates of its neighboring vertices. There are only two possibilities: the calculated vertex is above the diagonal connecting the previous vertex with the next one (its $x$ coordinate is the same as in the previous vertex and its $y$ coordinate is the same as in the next vertex), or the calculated vertex is below this diagonal (its $y$ coordinate is the same as in the previous vertex and its $x$ coordinate is the same as in the next vertex). The description of every vertex besides the top left one must therefore contain a single bit indicating where the previous (omitted) vertex lies with respect to the diagonal.
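The following minimal Java sketch (hypothetical names) shows how a decoder could recover an omitted vertex from the two stored neighboring vertices and the single diagonal bit described above.

public final class RegionVertices {

    /** Recovers an omitted vertex from the stored previous and next vertices.
     *  Each vertex is given as {x, y}; aboveDiagonal is the stored single bit. */
    static int[] recoverOmittedVertex(int[] previous, int[] next, boolean aboveDiagonal) {
        if (aboveDiagonal) {
            // x coordinate taken from the previous vertex, y from the next one.
            return new int[] { previous[0], next[1] };
        } else {
            // y coordinate taken from the previous vertex, x from the next one.
            return new int[] { next[0], previous[1] };
        }
    }
}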

No other change has to be introduced in order to use the proposed method of describing the fractal transformations with a binary representation. The number of bits per single transformation will be significantly larger than in the HV coding method. However, the compression method based on irregular regions is expected to produce a smaller number of ranges, so the overall number of bits per pixel may be lower than in the HV coding method.

5.6. Time Cost Reduction

5.6.1. Variance-based Acceleration

The proposed method of accelerating the fractal encoding builds on block classification (section 4.4), codebook reduction (section 4.5) and excluding impossible matches (section 4.6). The work of He, Yang and Huang [CH04] is the basis of the proposed method.

There are only two classes:

shade blocks

ordinary blocks

The classification is made based on the intensity variance of the blocks. If the variance value is smaller than the value given by the user, the range block is approximated by a fixed block – no search for a matching codebook block is made.

The “ordinary” blocks are the ones for which codebook blocks have to be found. Lee and Lee [Lee98] state that the distance between the range and codebook block is closely connected to the distance between the variances of the blocks. A smaller variance distance between two blocks should result in a smaller error between them.

To measure the variance the following equation is used:

$$\sigma(B_i) = \frac{1}{|B_i|} \sum_{j=0}^{|B_i|-1} b_j^2 - \left(\frac{1}{|B_i|} \sum_{j=0}^{|B_i|-1} b_j\right)^2$$

where $B_i = (b_0, \ldots, b_{|B_i|-1})$ is the measured block and $|B_i|$ is the number of pixels in the block $B_i$.

Before the encoding process is started, the user sets the variance value $\Delta\sigma$ that delimits the subset of the codebook that shall be considered during the search. The subset contains different codebook blocks depending on the range block that is currently being encoded. All codebook blocks that have variance larger than the variance of the range block plus the value given by the user are omitted from the search – they are treated as blocks that cannot yield an error smaller than the tolerance criterion.

Because the scaling coefficient of the intensity transformation is restricted to values from the range $\langle 0, 1 \rangle$, all codebook blocks that have variance smaller than the current range block can also be removed from consideration.

Thus, for each range block a subset of the codebook is dynamically created. The subset contains all codebook blocks that can potentially be matched with the range block, and no other codebook block is considered for that range block. The content of the codebook subset is:

$$C_{R_i} = \{ C_j \in C : \sigma(R_i) < \sigma(C_j) < \sigma(R_i) + \Delta\sigma \}$$

where $C$ is the entire codebook and $C_{R_i}$ is the subset of the codebook with blocks that shall be compared with the range block $R_i$ during the search.
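A simple Java sketch of this variance-based reduction is given below; the names are hypothetical and the codebook blocks are represented as plain pixel arrays rather than the descriptions used by the encoder.

import java.util.ArrayList;
import java.util.List;

public final class VarianceFilter {

    /** Intensity variance of a block given as a flat pixel array. */
    public static double variance(double[] b) {
        double sum = 0, sumSq = 0;
        for (double p : b) {
            sum += p;
            sumSq += p * p;
        }
        double n = b.length;
        double mean = sum / n;
        return sumSq / n - mean * mean;
    }

    /** Keeps only codebook blocks that can potentially match the range block. */
    public static List<double[]> reduce(double[] rangeBlock, List<double[]> codebook,
                                        double deltaSigma) {
        double sigmaR = variance(rangeBlock);
        List<double[]> subset = new ArrayList<double[]>();
        for (double[] c : codebook) {
            double sigmaC = variance(c);
            // Scaling is restricted to [0, 1], so blocks with smaller variance than
            // the range block are excluded, as are blocks exceeding sigmaR + deltaSigma.
            if (sigmaC > sigmaR && sigmaC < sigmaR + deltaSigma) subset.add(c);
        }
        return subset;
    }
}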


5.6.2. Parallelization

The encoding process can be accelerated by parallelization. Parallel processing is very promising because the affine transformations that compose the fractal code can be found independently. Parallelization is also possible in the proposed encoding algorithm, where the user can fix the number of threads that shall be used for the encoding.

Before the encoding starts, a pool with the given number of threads is created. The threads share a queue of range blocks awaiting encoding – the first range block, covering the whole image, is added to this queue by the main thread.

Each thread consumes one range block from the beginning of the queue and performs the entire encoding process, i.e. determines the codebook for the range block, finds the best matching codebook block, and computes the intensity transformation parameters and the distance between the range and codebook blocks. If the error between the range block and the best matching codebook block does not fulfill the tolerance criterion, the range block is split within the same thread and the newly created range blocks are added to the common queue. However, if the error between the range block and the transformed codebook block meets the tolerance criterion, the range block (with the description of the affine transformation) is added to the thread’s internal data structure.

All threads end their life when the queue of range blocks to encode is empty and there is no active thread that can produce new range blocks – the main thread monitors the state of the threads and the number of elements in the queue and interrupts the threads when the encoding is finished. The last action the threads perform before they are terminated is sending the structure with the encoded range blocks to the main thread.

In the program there are only two instances of the image – the first instance is used by all range blocks and the second one is contracted and used by the codebook blocks. Thus, the threads share not only the queue with the uncovered range blocks but indirectly also the images.

The algorithm of the main thread is as follows:

1. declare and initialize the queue and the array
2. create a range with location (0, 0) and size equal to the image size and add it to the queue
3. create the given number of threads that perform the proper encoding and launch them
4. wait for a signal from any thread
5. if the queue is empty and all threads are in the 'idle' state then go to the next step, otherwise go to the previous step
6. get the arrays of encoded ranges and merge them into a fractal operator
7. store the binary representation of the fractal operator to the file

The threads that run in parallel perform the following algorithm:

1. wait until the queue in the main thread is not empty
2. set the state of this thread to 'processing'
3. get the first element from the queue
4. find the best matching codebook block, the intensity transform and the error
5. if the error fulfills the tolerance criterion then add the range block with all transformation parameters to the inner array of range blocks, otherwise split the range block and add the new range blocks to the queue in the main thread
6. set the state of this thread to 'idle'
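The sketch below illustrates one possible realization of this scheme in Java; the Encoder interface, the active-task counter and all names are hypothetical simplifications of the thread and queue bookkeeping described above, not the actual implementation.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelEncoder {

    /** Hypothetical single-block encoder shared by all worker threads. */
    interface Encoder {
        /** Returns an encoded transformation, or null if the block must be split. */
        Object tryEncode(RangeBlock r);
        List<RangeBlock> split(RangeBlock r);
    }

    static class RangeBlock { int x, y, width, height; }

    private final BlockingQueue<RangeBlock> queue = new LinkedBlockingQueue<RangeBlock>();
    private final AtomicInteger pending = new AtomicInteger();   // blocks queued or being processed
    private final List<Object> transformations = new CopyOnWriteArrayList<Object>();

    public List<Object> encode(final RangeBlock whole, final Encoder encoder, int threads)
            throws InterruptedException {
        pending.set(1);
        queue.put(whole);                                 // first block covers the whole image
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    try {
                        while (pending.get() > 0) {
                            RangeBlock r = queue.poll(10, TimeUnit.MILLISECONDS);
                            if (r == null) continue;      // queue momentarily empty
                            Object transform = encoder.tryEncode(r);
                            if (transform != null) {
                                transformations.add(transform);
                            } else {
                                for (RangeBlock child : encoder.split(r)) {
                                    pending.incrementAndGet();
                                    queue.put(child);
                                }
                            }
                            pending.decrementAndGet();    // this block is finished
                        }
                    } catch (InterruptedException ignored) { }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.DAYS);          // encoding ends when no blocks remain
        return new ArrayList<Object>(transformations);
    }
}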

5.7. Summary of the Proposed Compression Method

Each design issue is addressed with several alternative approaches in order to find the optimal solution.

The proposed method uses Horizontal-Vertical partitioning [Fis95c]. The reasons for this choice are the very good results of this approach and the lack of any other approach (in the literature) that gives better results at low compression ratios.

However, several ways of dividing range blocks during encoding (when a matching codebook block cannot be found) are considered. The work of Fisher [Fis95c] and the work of Saupe, Ruhl, Hamzaoui, Grandi and Marini [Sau98] are the basis for the following division methods:

Edge detection with Fisher’s rectangle degradation prevention mechanism

Variance minimization with Saupe’s rectangle degradation prevention mechanism

Edge detection with Saupe’s rectangle degradation prevention mechanism

Variance minimization with Fisher’s rectangle degradation prevention mechanism

Edge detection with flat rectangle degradation mechanism

Variance minimization with flat rectangle degradation mechanism

The splitting technique using edge detection is taken from [Fis95c] and improved. The approach based on variance minimization is the same as in [Sau98]. Fisher’s and Saupe’s rectangle degradation prevention mechanisms are exactly the same as in the original works. The flat mechanism is a new approach that allows checking whether the rectangle degradation prevention mechanisms have a positive effect on the image quality.

The codebook in fractal compression is “virtual”, i.e. there is no need to store the codebook blocks in a separate data structure because they can be taken from the image that is being encoded. Nevertheless, some additional information about the codebook blocks has to be kept in memory during the encoding, such as the location of a block, its size and the values of its inner products (which require time-consuming calculations).

Depending on how this additional information is created and stored in memory, several codebook types are distinguished:

on-the-fly codebook (light codebook) – the additional information is created every time a codebook block is accessed (during encoding) and is not stored for further processing. The application stores only the position of the next codebook block.

solid codebook – the additional information about every possible codebook block is created in a preliminary phase, regardless of whether the codebook block will be accessed at least once during the encoding or not. This information is stored for the entire encoding process. The calculations of the codebook blocks' inner products and variances can be made during this preliminary phase or they can be performed when the blocks are accessed for the first time.

hybrid codebooks – two types that attempt to merge the advantages of the two above-listed codebook types:

Solid codebook filled during the encoding with the light codebook.

Solid codebook used for range blocks smaller than some size and the light codebook for all other range blocks.

The realization of the isometries was proposed by the author in section 5.4. The use of symmetry operations is optional, which allows investigating whether there is any profit from the isometries. This allows resolving the doubts about this topic (presented in section 3.3.2).

There are two methods of creating the fractal code. The first one is very similar to the methods presented in [Fis95c] and [Sau98]. The difference is that instead of storing the locations of the domain blocks, the locations of the corresponding codebook blocks are stored. This saves a single bit per transformation, but the full search (the maximal size of the domain pool, i.e. a domain offset equal to 1) is not possible. This approach stores the entire partitioning tree in the file. The author proposes a second approach, which stores only the leaves of this tree, i.e. only the range blocks that were matched with codebook blocks. This approach can be used not only in the fractal compression method based on HV partitioning but also in a method based on irregular regions created with the use of HV partitioning.

Two methods of accelerating the encoding are proposed. The first one uses the intensity variance of the blocks. Range blocks with low variance can be treated exactly the same as flat range blocks – no search for a matching codebook block is required because the range block can be well approximated by a block of fixed intensity. The other aspect of the variance-based acceleration eliminates from the search all codebook blocks that have variance lower than the variance of the range block currently being encoded. Codebook blocks with variance much larger than that of the range block can also be eliminated from the search with this acceleration technique.

The second acceleration technique parallelizes the encoding. The proposed parallelization scheme creates a pool of threads that encode range blocks independently. Each such thread can encode one range block at a time. The threads share the following data: the image, the contracted copy of the image and a queue of range blocks to be encoded.

Chapter 6

Experimental Results

The application WoWa Fractal Coder implements the proposed fractal compression method and provides an easy to use graphical user interface. More information about

the application can be found in appendix C.

The whole application is written in Java (JDK 1.6). The GUI is created with the Swing library. The look and feel of the application is provided by the Substance library¹.

The implementation allows numerous alternative versions of the encoding algorithm. The version can be picked by configuring the application before starting the encoding process.

The encoder settings that significantly influence the algorithm are:

method used to divide blocks:
– a block divided along the most significant horizontal or vertical edge
– a block divided along a line that gives the minimal sum of intensity variances of the resultant blocks

rectangle degradation prevention mechanism:
– Fisher's mechanism
– Saupe's mechanism
– flat mechanism

whether or not the codebook will be extended by allowing isometries

the contraction factor $CF$ of the contraction transformation $\tau_C$

whether the decoding shall be performed through Single-Time Quantization or by the standard algorithm

which type of codebook shall be used

There are also a number of settings that can be enabled in order to accelerate the encoding process:

whether the algorithm shall find the best matching codebook block for all considered range blocks, or whether a codebook block that fulfills the error tolerance criterion is enough

whether the search for transformations shall be performed in sequence – range after range – or in parallel by enabling multithreading

whether the variance acceleration shall be used

the variance value below which blocks shall be treated as shade blocks (approximated by a fixed codebook block)

codebook search space restriction – the maximal variance difference between the range block and the codebook blocks that shall be considered during the search for a matching codebook block

whether or not the solid codebook shall be filled on the fly (from the light codebook)

1. More information at the project website: https://substance.dev.java.net

Figure 6.1. An example medical image and its fractally encoded version: (a) original 532 × 434 image; (b) fractally encoded and decoded image, PSNR = 40.01, CR = 8.92.

The influence of the settings on the encoding process and its output was checked, especially their influence on the fidelity, the compression ratio and the encoding time. Doppler ultrasonography images were used for the tests. The test image is presented in figure 6.1(a). It is a monochrome image with 8 bits per pixel and a size of 532 × 434. All tests were performed on the same PC with an Intel Core 2 CPU at 2 GHz and 1 GB RAM.

6.1. Block Splitting

According to the performed experiments, the choice of the block splitting technique and the rectangle degradation prevention mechanism has a large effect on the rate-distortion performance of the algorithm. The performance of the six options described in section 5.2 is presented in figure 6.2.

Figure 6.2. Performance of the proposed compression algorithm depending on the block splitting mechanism and the rectangle degradation prevention mechanism: (a) rate-distortion functions; (b) encoding time and image fidelity relationships. ED denotes the splitting mechanism based on edge detection and VM the mechanism based on variance minimization.


The best results are given by the modified Fisher's approach, i.e. the splitting technique based on the most significant edge in the block intensity combined with the original Fisher's rectangle degradation prevention mechanism.

However, there are two other configurations which give a lower noise level at very low compression ratios; both of them are based on edge detection. When Saupe's rectangle degradation mechanism is used, this approach is superior to edge detection with Fisher's mechanism at compression ratios lower than 3.5, and when the flat rectangle degradation mechanism is used, it is better at compression ratios lower than 2.3.

Compression ratios lower than 2.3, or even than 3.5, can be obtained with lossless image coding methods. This means that in this range of compression ratios no lossy compression method should ever be used, because the same amount of disk space can be saved without losing any information.

However, when the image is encoded in order to decode it at a higher resolution (fractal magnification), the compression ratio is insignificant. The edge detection splitting technique with Saupe's rectangle degradation mechanism seems appropriate for this purpose because it achieves the highest PSNR, equal to 52.07 dB.

Surprisingly, low rate-distortion performance can be observed when Saupe's block splitting technique is used – it is never higher than that of the approach based on edge detection combined with Fisher's rectangle degradation prevention mechanism.

Besides the fidelity of the compressed image, the encoding time is also important. Here, all splitting methods based on the variance minimization approach are the slowest ones.

Among the splitting methods based on edge detection, the use of Saupe's rectangle degradation mechanism results in the best encoding time. To obtain a PSNR of 40 dB it needs about 12 minutes, while the approach based on edge detection with Fisher's rectangle degradation prevention mechanism requires 18 minutes and 20 seconds, and the approach based on variance minimization with Saupe's mechanism consumes over 23 minutes. This comparison once again suggests that the combination of the edge detection approach with Saupe's rectangle degradation mechanism can be very useful in image magnification, where the user surely does not want to spend too much time to zoom an image.

At compression ratios from 4 to 13, this combination is characterized by the second best rate-distortion performance. Thus, because of the advantage of the shortest encoding time, the combination of the block splitting mechanism based on edge detection and Saupe's rectangle degradation mechanism can be a reasonable choice also when compression is performed in order to reduce the file size.

The flat rectangle degradation prevention mechanism was introduced in order to check whether any such mechanism gives benefits. Both Saupe's and Fisher's mechanisms favor cutting line locations that are closer to the middle of the block to be divided. It turns out that the application of these mechanisms decreases the number of transformations, i.e. it increases the compression ratio. In both splitting approaches, the flat rectangle degradation mechanism gave the worst results.

These experimental results lead to the following important conclusions:


• the edge detection technique for finding the cutting line of a split outperforms the block variance minimization approach
• the use of a rectangle degradation prevention mechanism has positive effects on all aspects of fractal compression performance

When Fisher's mechanism is used, a better rate-distortion relationship can be observed, but when Saupe's mechanism is chosen, the encoding takes less time.

6.2. Number of Bits for Scaling and Offset Coefficients

A low number of bits for the intensity transform coefficients results in a lower number of bits per transformation. However, in such a situation the coefficients can take only a few values, which can make it difficult to find quantized coefficient values that produce an error between the range and codebook blocks smaller than the given threshold (distance tolerance criterion). Thus, a low number of bits for the coefficients may increase the number of transformations in the fractal code; and the more transformations there are, the longer it takes to pair the range blocks with codebook blocks.

In order to find the number of bits that balances the transformation code length and the number of transformations (and the encoding time), a new factor is proposed, which also takes the PSNR into account:

F(b) = PSNR(b) · CR(b) / EncodingTime(b),

where b denotes the number of bits used to store the scaling/offset coefficient in the code of a single transformation.

The maximal value of the function F(b) indicates the optimal number of bits, balancing the three most important factors in fractal compression: noise level, compression ratio and encoding time. Of course, the factor F shall be calculated for every possible number of bits for the coefficient, with no other settings changed.
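The selection procedure amounts to an exhaustive search over the candidate bit counts, as in the sketch below. The Result class and the encodeWith() call are hypothetical stand-ins for running the encoder with b bits per coefficient and measuring PSNR, compression ratio and encoding time; only the argmax logic of F(b) is shown.

// Sketch of the bit-allocation selection based on F(b) = PSNR(b) * CR(b) / EncodingTime(b).
class BitAllocationSketch {

    static class Result {
        // dummy placeholder values; a real run would fill these in
        double psnr = 40.0, cr = 8.0, encodingTimeSeconds = 60.0;
    }

    // placeholder: encode the test image with b bits per scaling/offset coefficient
    static Result encodeWith(int b) {
        return new Result();
    }

    static int optimalBits(int minBits, int maxBits) {
        int best = minBits;
        double bestF = Double.NEGATIVE_INFINITY;
        for (int b = minBits; b <= maxBits; b++) {
            Result r = encodeWith(b);                 // all other settings stay fixed
            double f = r.psnr * r.cr / r.encodingTimeSeconds;
            if (f > bestF) {
                bestF = f;
                best = b;
            }
        }
        return best;                                  // argmax of F(b)
    }
}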

Figure 6.3. Test of the bit allocation for the intensity transformation coefficients: (a) scaling coefficient; (b) offset coefficient.


The test was repeated for different values of the distance tolerance criterion in order to check whether the optimal bit allocation remains the same for different compression ratios. The considered tolerance criterion values were:

• error_t = 0.05: when the bit allocation of the scaling coefficient is investigated, CR ∈ (1.29, 1.6); when the optimal bit allocation for the offset coefficient is searched, CR ∈ (1.43, 1.7)
• error_t = 1: scaling – CR ∈ (2.32, 2.39); offset – CR ∈ (2.17, 2.6)
• error_t = 5: scaling – CR ∈ (5.02, 5.24); offset – CR ∈ (4.25, 5.62)
• error_t = 50: scaling – CR ∈ (24, 26.2); offset – CR ∈ (21, 26.7)

The normalized value of the factor is presented in figure 6.3. For very low compression ratios, the optimal numbers are three bits for scaling and two for offset. This case is exceptional, however, because for higher compression ratios the optimal number of scaling bits falls into the range ⟨4, 6⟩, and the number of bits indicated by Fisher as optimal ([Fis95b]) always gives at least the second best result. The optimal number of offset bits is in the range ⟨6, 8⟩, which also confirms that 7 can be treated as the golden mean.

6.3. Coding the Transformations

In section 5.5, two different approaches to the construction of the fractal code were presented. The first one stores the whole partitioning tree (information about all range blocks – both the ones that have a matching codebook block and the ones that had to be divided), while the second one tries to store efficiently only the leaves of this tree (the range blocks that have a transformation bound to them).

Their performance was measured and is presented in figure 6.4.

The standard approach (the whole partitioning tree) turns out to be much more effective when the partitioning and search processes produce a small number of transformations – e.g. when the error tolerance criterion is not too restrictive. Although the new approach always produces a longer fractal code, when high fidelity is preserved (which is reflected in many small range blocks) the difference between the two approaches is rather small – about two bits per transformation.

This means that the new approach can give very good results when it is employed in a fractal compression method based on irregular regions created from HV partitions. The first phase of the irregular partitioning, where the image is divided into many blocks with low variance (the splitting phase), is actually very similar to pure hierarchical partitioning with maximal fidelity. And in such circumstances, the new approach to storing the transformations gives the best outcomes.


Figure 6.4. Test of the bit allocation for the intensity transformation coefficients: (a) scaling coefficient; (b) offset coefficient.

6.4. Acceleration

In the above tests, the encoding process took at least 18 minutes (the algorithm variant based on edge detection with Fisher's rectangle degradation prevention mechanism) or 11.5 minutes (edge detection with Saupe's rectangle degradation prevention mechanism) when the PSNR is slightly higher than 40 dB.

The encoding can be accelerated in several ways:

• reducing the virtual codebook size
• quitting the search process (the search for a matching codebook block for a given range block) after encountering the first codebook block that fulfills the error tolerance criterion
• choosing a faster approach to fill and store the virtual codebook
• enabling variance-based acceleration:
  – treating range blocks with low intensity variance as shade blocks (approximated with fixed codebook blocks)
  – performing the search only on a subset of the codebook – the selection of the codebook blocks is made by setting the maximal distance between the variances of the range block and the codebook blocks

6.4.1. Codebook Size Reduction

The first way to reduce the codebook size is to resign from the symmetry operations. The encoding time of a single range block is eight times longer when the isometries are enabled. When the domain offset is equal to 2, it takes over 30 hours to encode the test image (532 × 434). Such an encoding time is unacceptable, especially because the increase in PSNR is rather marginal. All tests mentioned here were performed with isometries.

A larger offset between domain block locations also reduces the codebook size and very significantly shortens the encoding time. The image fidelity and the encoding time at higher domain offset values are presented in figure 6.5.

Figure 6.5. Influence of the codebook size on the PSNR and the encoding time: (a) domain offset – PSNR relationship; (b) domain offset – encoding time relationship.

For the domain offset equal to the smallest possible value, i.e. 2, when the codebook is the largest, the PSNR reaches its highest value for the variant of the algorithm based on edge detection and Fisher's rectangle degradation mechanism. The PSNR is equal to 50.67 dB, but the time cost is high – about 50 minutes. Increasing the domain offset from 2 to 4 reduces the codebook size four times and in consequence results in a four times shorter encoding time (12 minutes 10 seconds). Further increasing the domain offset (from 2 to 6) reduces the encoding time to 5 minutes 32 seconds (an almost 9-fold improvement). A domain offset equal to 8 results in an encoding time of 3:09 (an almost 16-fold improvement).
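The roughly quadratic effect of the domain offset on the codebook size follows from a simple count of domain block positions on a regular grid. The sketch below is a standard sliding-window count, not taken from the thesis, and the 16 × 16 domain size used in the example is purely illustrative (the HV encoder works with variable block sizes).

// Number of domain block positions for a given offset: doubling the offset
// cuts the count roughly by a factor of four, which is why the encoding time
// drops almost quadratically with the offset.
class DomainCountSketch {
    static long domainCount(int imageWidth, int imageHeight,
                            int domainSize, int domainOffset) {
        long perRow = (imageWidth  - domainSize) / domainOffset + 1;
        long perCol = (imageHeight - domainSize) / domainOffset + 1;
        return perRow * perCol;
    }

    public static void main(String[] args) {
        // e.g. the 532 x 434 test image with illustrative 16 x 16 domain blocks
        for (int offset : new int[] {2, 4, 6, 8, 12}) {
            System.out.println(offset + " -> " + domainCount(532, 434, 16, offset));
        }
    }
}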

These very significant time savings do not have very negative consequences for the image fidelity. When the domain offset is 4, the PSNR is only 1.2 dB (2.5%) smaller compared to the results of encoding with a domain offset equal to 2. Setting the domain offset to 6 reduces the PSNR by 2.4 dB (4.7%), and setting it to 8 gives a PSNR smaller by 3.3 dB (6.6%). In all of these examples, the PSNR stays at a very high level – over 47 dB.

An interesting example is also the domain offset value of 12, which allows encoding the test image with a PSNR equal to 46.1 dB in only 1 minute 20 seconds.

Not only the codebook size influences the image fidelity – the content of the codebook is equally important. This is the reason why the PSNR does not decrease consistently with increasing domain offset.

Figure 6.6. Best match searching versus first match searching: (a) distortion–time relationship; (b) rate–distortion relationship.

6.4.2. Breaking the Search

The error tolerance does not have any influence on the size of the subcodebook (the subset of the codebook that contains all codebook blocks compared with a given range block) in the original approach. In this situation, a higher error tolerance value can only reduce the encoding time indirectly, by reducing the number of range blocks. However, when the search is interrupted after finding the first codebook block from the subcodebook that yields an error smaller than the error tolerance (FM – First Match), the search process for a single range block is also shortened at error tolerance values higher than 0.

But when the search is broken in the above-described way, not all range blocks are paired with the best matching codebook blocks. The gain in encoding time and the cost in PSNR can be seen in figure 6.6.
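The difference between the two strategies is only in the control flow of the search loop, as in the sketch below. The CodebookBlock type and the matchError() call are hypothetical placeholders; the real encoder also fits the intensity transform for each comparison, which is omitted here.

import java.util.List;

// Best-match versus first-match (FM) searching over a subcodebook.
class SearchStrategySketch {

    static class CodebookBlock { }

    // placeholder: error of approximating the range block by this codebook block
    static double matchError(CodebookBlock c) {
        return Math.random();
    }

    static CodebookBlock search(List<CodebookBlock> subcodebook,
                                double errorTolerance, boolean firstMatch) {
        CodebookBlock best = null;
        double bestError = Double.MAX_VALUE;
        for (CodebookBlock c : subcodebook) {
            double e = matchError(c);
            if (e < bestError) {
                bestError = e;
                best = c;
            }
            // FM: stop at the first block that already fulfils the tolerance
            if (firstMatch && e <= errorTolerance) {
                return c;
            }
        }
        // best match: the whole subcodebook has been scanned;
        // null means the range block has to be split further
        return bestError <= errorTolerance ? best : null;
    }
}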

Breaking the search definitely allows obtaining a compressed image with the same PSNR in a shorter time – 40 dB can be achieved in almost two minutes (10.7%) less time. However, it has a negative influence on the rate-distortion characteristic of the algorithm. For example, when the best match is searched, the resulting image gives 41.3 dB at a compression ratio of 7.4 : 1, but the "first match" option would give only 39.6 dB at the same compression ratio. This cost in image fidelity cannot be ignored.

Figure 6.7. Performance of different codebook types.


6.4.3. Codebook Optimization

The discussion of different approaches to the codebook was presented in section 5.3.

The shortest encoding time can be achieved with the hybrid solution, which uses the light codebook to fill the solid codebook during encoding. When the second hybrid approach is applied on top of such a codebook, i.e. the solid codebook contains only the blocks smaller than a given size and the light codebook provides the larger blocks, the results are equally good.

The solid codebook with inner products calculated in the preliminary phase gives very weak results, even when its use is restricted to the smallest codebook blocks. This confirms that a large set of codebook blocks which potentially can be used during encoding is unnecessary, because the partitioning process does not produce any range blocks of the same size as these codebook blocks.

However, it is profitable to remember the inner products of the codebook blocks that were compared with a range block. The inner product calculations contribute remarkably to the time cost of the fractal encoding. Even when the pure solid codebook is used, but the inner products are postponed to the first access to a codebook block, the acceleration is impressive.
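Postponing the inner products to the first access is a plain lazy-evaluation pattern, sketched below. The field names and the choice of cached quantities are illustrative assumptions, not the actual API of the light/solid codebook classes in WoWa Fractal Coder.

// Lazy evaluation of a codebook block's precomputed sums: the quantities
// needed by the least-squares fit of the intensity transform are computed on
// the first access and cached, so blocks never compared with any range block
// cost nothing.
class LazyCodebookBlockSketch {
    final double[] pixels;          // intensities of the contracted domain block
    private double sum = Double.NaN;
    private double sumOfSquares = Double.NaN;

    LazyCodebookBlockSketch(double[] pixels) {
        this.pixels = pixels;
    }

    double getSum() {
        if (Double.isNaN(sum)) {                 // computed at most once
            double s = 0;
            for (double p : pixels) s += p;
            sum = s;
        }
        return sum;
    }

    double getSumOfSquares() {
        if (Double.isNaN(sumOfSquares)) {
            double s = 0;
            for (double p : pixels) s += p * p;
            sumOfSquares = s;
        }
        return sumOfSquares;
    }
}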

6.4.4. Variance-based Acceleration

The acceleration method described in section 5.6.1 gives outstanding results. Graphs 6.8 and 6.9 present the percentage of the encoding time that can be saved thanks to the variance-based acceleration. However, degradation of the image fidelity can also be observed – it is presented as a percentage of the PSNR value calculated for the same image encoded with the same settings but without variance-based acceleration.

Figure 6.8. Influence of the classification into "shade" and "ordinary" blocks on the encoding time, compression ratio and PSNR.

Figure 6.8 presents the results for encoding the test image with the error tolerance set to 0, but at higher values of the error tolerance (higher compression ratios, lower image fidelity) the dependencies between the presented values remain the same. The higher the limit on the shade block variance, the higher the reduction in PSNR. However, it is always exceeded by the savings in encoding time. It can also be noticed that the use of shade blocks brings profits in the compression ratio: when a range block approximated with a fixed block is stored in the fractal code, there is no need to store the information about the codebook block (location, isometry) – several bits are saved for each range block that becomes a shade block.

Even better results are given by eliminating the codebook blocks with variance lower than that of the currently processed range block or with too high a variance.
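Both variance-based criteria reduce to a few lines of code, as in the sketch below; variance() is the ordinary intensity variance of a block, and the two thresholds correspond to the shade-block limit and the maximal variance distance discussed above. The block representation as a plain array is an assumption of the sketch.

import java.util.ArrayList;
import java.util.List;

// Variance-based acceleration: (1) range blocks with very low variance are
// treated as shade blocks and approximated by a fixed-intensity block;
// (2) codebook blocks whose variance is lower than the range block's, or too
// much higher, are excluded from the search.
class VarianceAccelerationSketch {

    static double variance(double[] pixels) {
        double mean = 0;
        for (double p : pixels) mean += p;
        mean /= pixels.length;
        double var = 0;
        for (double p : pixels) var += (p - mean) * (p - mean);
        return var / pixels.length;
    }

    static boolean isShadeBlock(double[] rangePixels, double shadeVarianceLimit) {
        return variance(rangePixels) < shadeVarianceLimit;
    }

    static List<double[]> restrictCodebook(List<double[]> codebook,
                                           double rangeVariance,
                                           double maxVarianceDistance) {
        List<double[]> subcodebook = new ArrayList<double[]>();
        for (double[] c : codebook) {
            double v = variance(c);
            // keep only blocks with variance not lower than the range block's
            // and not farther away than the allowed distance
            if (v >= rangeVariance && v - rangeVariance <= maxVarianceDistance) {
                subcodebook.add(c);
            }
        }
        return subcodebook;
    }
}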

Figure 6.9. Performance of the codebook block elimination based on the blocks’ variance difference.

A reduction of the PSNR by 1 dB results in an almost two times shorter encoding time. The encoding time can be shortened to 2/3 of the original time by setting a very large variance distance between range and codebook blocks – for the test image this value was equal to 3000 and caused a negligible loss of fidelity of 0.3 dB of PSNR.

Too restrictive a selection of the codebook blocks to be compared with the range blocks may cause problems with finding any codebook block that fulfills both conditions (variance-based selection and range block–codebook block error). If such a situation occurs, some range blocks will not be encoded, and if these range blocks are too small to be divided, not the whole image will be covered with transformations – the image fidelity will drop drastically. This is visible in figure 6.9: a variance selection criterion in the range ⟨30, 60⟩ results in one range block that cannot be encoded, and lower values of the criterion give many more such range blocks.

6.4.5. Parallelization

The tests were performed on a machine with a dual-core processor. Thus, it is possible to check what the speed-up is when the encoding is performed with two threads. Theoretically, the speed-up should be equal to the number of processors. At a PSNR equal to 40 dB the encoding time is decreased exactly 1.95 times, so the actual speed-up is very close to ideal.


6.4.6. Spiral Search

According to the literature, the order of traversing the codebook blocks has a significant impact on the encoding time (section 3.2.2). A reduction of the encoding time through the spiral search is possible only when the probability of finding a matching codebook block grows as the spatial distance between the codebook block and the range block decreases.

In order to verify whether the spiral search can save some encoding time also when a medical image is compressed, a histogram of the spatial distances between range and codebook blocks paired in a single transformation was created. Although the histogram was created based on the test image, it looks similar for other Doppler USG images.

Figure 6.10. Histogram of spatial distances between matching codebook blocks and range blocks.

The histogram shows that only in the closest neighborhood of the range block, when the spatial distance is smaller than or equal to 2 pixels, is there a small increase in the likelihood of matching the range block and the codebook block. However, the most probable case is to find a matching codebook block that is about 209 pixels away from the range block (the size of the image is 532 × 434). The probability is concentrated around this maximum and decreases in the shape of a trinomial.

These observations allow stating that the improvement in encoding time made by the spiral search would be insignificant or nonexistent. Some improvement may only be achieved when the codebook blocks located too far away are eliminated – during the encoding of the test image, no range block was paired with a codebook block located farther than 648 pixels away. A good value for the spatial range–codebook block distance limit would in this case be 550 or 500, because only a very small percentage of the found transformations require distances in that range. This confirms that the Restricted Search Area may have a positive influence on the encoding time, but the search area has to cover most of the image in order to preserve high image fidelity.
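Such a restriction can be expressed as a simple spatial filter applied before the error comparison. Using the Euclidean distance between block centres, as below, is an assumption of this sketch rather than a description of the thesis' implementation.

// Restricted Search Area: codebook blocks whose centre lies farther from the
// range block's centre than a fixed limit (e.g. 500-550 pixels for the test
// image) are skipped before any error computation.
class RestrictedSearchAreaSketch {

    static boolean withinSearchArea(int rangeCentreX, int rangeCentreY,
                                    int codebookCentreX, int codebookCentreY,
                                    double maxDistance) {
        double dx = rangeCentreX - codebookCentreX;
        double dy = rangeCentreY - codebookCentreY;
        return Math.sqrt(dx * dx + dy * dy) <= maxDistance;
    }
}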

6.5. Comparison with JPEG

The JPEG/DCT compression is lossy, like the fractal compression. It is a good reference point for any lossy compression method because it is included in the DICOM (Digital Imaging and Communications in Medicine) standard.


The JPEG standard was tested on five images, including the test image from the previous sections. The other images can be seen in appendix A. The comparison of the two methods can be seen in figure 6.11, where average results for the five images are presented.

The USG images are accurate enough when the compression ratio obtained with the JPEG algorithm is not higher than 9 : 1 (see section 1.2). Other sources indicate that the PSNR of the reconstructed medical image shall be higher than 40 dB [Mul03]. The two requirements are coincident, because at compression ratio 9 : 1 the average PSNR value obtained with the JPEG algorithm is equal to 40 dB. The proposed fractal compression algorithm gives slightly worse results – a PSNR equal to 40 dB is attainable at a compression ratio of about 7.5 : 1, and a compression ratio equal to 9 : 1 produces 38.75 dB of PSNR.

Figure 6.11. Comparison of the proposed fractal compression method and JPEG according to different objective measures: (a) Peak Signal to Noise Ratio; (b) Mean Squared Error; (c) Image Fidelity; (d) Mean Absolute Error.

In order to gain full knowledge of the fractal compression performance in comparison to the JPEG standard, the fidelity of the decompressed images was measured not only with the PSNR and MSE measures but also with the following objective measures:


Image Fidelity (IF):

IF = 1 − Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} [X(m,n) − X̃(m,n)]² / Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} [X(m,n)]²

Mean Absolute Error (MAE):

MAE = (1 / (M·N)) · Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} |X(m,n) − X̃(m,n)|

All objective measures indicate that the proposed fractal compression method outperforms the JPEG algorithm only at high compression ratios. According to the PSNR measure, fractal compression is better at compression ratios higher than 18 : 1; the analysis of MAE gives the same conclusion. However, the results of IF and MSE show that fractal compression is better already at compression ratios higher than about 14 : 1 – 15 : 1.

Thus, the proposed fractal compression method is superior to the JPEG algorithm at compression ratios higher than 18 : 1 and inferior at compression ratios lower than 14 : 1. For both compression methods, compression ratios between 14 : 1 and 18 : 1 correspond to a similar fidelity of the reconstructed image.

According to the literature [Bel02, Che98], the encoding time of the fractal method is much longer (at least several dozen times) than that of the JPEG algorithm. However, detailed experimental measurements of this do not have to be performed in order to achieve the goal of the thesis.

6.6. Magnification

The fidelity of a fractally magnified image is measured indirectly, because there is no objective error measure that allows comparing images of different sizes. Because of this, after the magnified image Ẋ is created, it is encoded (with the same method and settings) and decoded back to the dimensions of the original image X – the image created in this way will be denoted Ẍ. Instead of measuring the PSNR between X and Ẋ, the distance is measured between X and Ẍ:

A × B  --(zoom in η times)-->  η·A × η·B  --(zoom out η times)-->  A × B
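The measurement procedure can be summarised as in the sketch below; encode(), decode() and psnr() are hypothetical wrappers around the coder and the measure defined earlier, and the pipeline itself is the only point being illustrated.

// Indirect fidelity measurement of fractal magnification: the magnified image
// (X-dot) is encoded again with the same settings and decoded back to the
// original size (X-double-dot); PSNR(X, X-double-dot) is reported.
class MagnificationFidelitySketch {

    static byte[] encode(double[][] image) { return new byte[0]; }        // placeholder
    static double[][] decode(byte[] code, int width, int height) {        // placeholder
        return new double[height][width];
    }
    static double psnr(double[][] a, double[][] b) { return 0; }          // placeholder

    static double magnificationPsnr(double[][] original, int eta) {
        int h = original.length, w = original[0].length;
        byte[] code = encode(original);
        double[][] magnified = decode(code, eta * w, eta * h);     // zoom in eta times
        byte[] codeOfMagnified = encode(magnified);                // same settings
        double[][] demagnified = decode(codeOfMagnified, w, h);    // zoom out eta times
        return psnr(original, demagnified);
    }
}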

Analogous calculations are made for the bilinear and bicubic interpolation. Figure 6.12 presents the results of the first magnification test, which was performed on six images of different sizes. All six images are parts of the test image 6.1(a). Each image was magnified two, four and eight times. The encoding settings may vary between images and magnification factors. The fractal magnification outperforms both interpolation methods. Only at a zoom factor of four is the quality of the image magnified with bicubic interpolation close to the fractally zoomed image. The largest difference can be observed at two-times magnification – close to 1 dB.

The results of the above experiment are affected by an imperfection of the method used to measure the fidelity of magnified images. During magnification (X → Ẋ) as well as during demagnification (Ẋ → Ẍ), the range blocks may have size 2 × 2 or larger.


Figure 6.12. Comparison of fractal magnification with interpolation methods for different magnification ratios.

This setting of the minimal range block size gives the best image fidelity but also causes problems during demagnification. When the image Ẋ is demagnified, the sizes of all range blocks are reduced as many times as the image is demagnified. A 2 × 2 range block is degraded to a single pixel when the image is demagnified two times, and range blocks of the same size are rescaled to zero pixels when the image is demagnified four or more times. Thus, not the whole fractal operator is used during decoding: the transformations with the smallest range blocks (with at least one side of the range block smaller than half of the demagnification ratio) do not have any influence on the reconstructed image.

Although the results of four- and eight-times fractal magnification (presented in figure 6.12) are lowered by this imperfection of the comparison method, they are still better than the results of the interpolation methods.

The second magnification test was performed on a larger set of images. The images have various sizes and were created by cutting out parts of five different ultrasonograms. The magnification methods were tested on forty images created in this manner; average values of the results are presented in table 6.1 and in figure 6.14. An additional measure was used – the maximal error (ME), which returns the maximal difference between corresponding pixels of the images X and Ẍ. In the figures, as well as in the table, the results of fractal magnification are confronted only with the better of the interpolation-based magnification methods – bicubic interpolation.

The goal of this test was to produce experimental results independent of the characteristics of a single image, but also to find a configuration of fractal magnification settings that would give results close to optimal for any ultrasonography image.

The results of the second test confirm the superiority of fractal magnification over interpolation methods. All objective measures besides the maximal error (ME) indicate that the fidelity of fractally magnified images is higher than the fidelity of images magnified through bicubic interpolation. The maximal error is higher for fractal magnification because the distribution of pixel errors differs between the methods. Although the overall error caused by fractal magnification is lower than the one caused by bicubic interpolation, rare pixels can be observed where the errors are higher. The interpolation decreases the brightness of the image during magnification, while the fractal magnification most often increases the brightness.

Table 6.1. Comparison of fractal magnification and the bicubic interpolation method.

                        PSNR    MSE     IF      ME      MAE
fractal (optimal TC)    37.77   12.17   0.9982  42.00   1.78
fractal (TC = 1)        37.58   12.68   0.9982  42.23   1.82
bicubic                 36.58   16.69   0.9978  22.4    2.57

Based on the experimental results obtained in the second magnification test, the dependence of the image fidelity on the image size was also investigated (figure 6.13). When the image is smaller, the virtual codebook contains fewer blocks and it is harder to find a good match between range blocks and codebook blocks. Thus, the larger an image is, the better the quality of the magnified image that can be obtained. The quality is significantly lower when 32 × 32 images are magnified.

Figure 6.13. Fractal magnification and interpolation methods for different image sizes.

Nevertheless, the decrease of image fidelity caused by a too small original image is smaller when the fractal method is used compared to bicubic interpolation. The image fidelity is lowered in bicubic interpolation because an incomplete context of the pixel is used when calculating pixel intensities. This concerns only pixels close to the border of the image, but when the image is smaller, the relative number of such pixels is higher.

According to a subjective assessment made in cooperation with physicians, the fidelity, quality and usefulness of images magnified with the fractal method are higher than with bicubic interpolation. The visibility and fidelity of small details, the readability of edges in the image and the lack of blurring are the main reasons for the superiority of fractal magnification. The quality of the fractally magnified image is lowered by the block effect, but it does not cause problems with reading the image. The block effect can be reduced by introducing postprocessing (see section 3.6).


Figure 6.14. Comparison of the proposed fractal compression/magnification method and interpolation methods according to different objective measures: (a) Peak Signal to Noise Ratio; (b) Mean Squared Error; (c) Image Fidelity; (d) Mean Absolute Error.


Figure 6.15. Fractal magnification and bicubic interpolation examples: (a) original 128 × 128 image; (b) fractally magnified image (PSNR = 36.94 dB); (c) magnification through bicubic interpolation (PSNR = 35.94 dB).

Conclusions

Research Question 1: Is it possible, and how, to minimize the drawbacks of the fractal compression method to a satisfying level in order to apply this method to medical imaging?

The two main drawbacks of fractal image compression that have to be eliminated in order to make its use in medical imaging possible are:

1. too large a loss of information
2. a very long encoding time

Any irreversible compression method has to face the first problem, because medical images have to be very accurate. The satisfying level of distortion for medical images was mentioned in section 6.5. The experiments proved that the proposed fractal compression method can reach this level (PSNR = 40 dB). Furthermore, an image of this quality is compressed at a compression ratio of about 7.5 : 1. Thus, the use of fractal compression reduces the file size about twice as efficiently as the lossless compression methods (compression ratios not larger than 4 : 1). At compression ratios from 4 : 1 to 9 : 1 the PSNR varies from 40 dB to 44.2 dB.

The accuracy is this high because the proper fractal compression method is used. The choice of the proper method would not have been possible without the meticulous review of fractal compression methods presented in this thesis.

The second problem can be solved in a variety of ways, which were described in the thesis. Several of them were implemented and some new propositions were made to optimize the codebook operations. All of these efforts gave satisfactory results. It is possible to encode a medium-sized ultrasonography image in a few to several dozen seconds, and a 256 × 256 image in about 5 seconds, while the required image fidelity is preserved (2 GHz processor, 1 GB RAM). The main means of reducing the encoding time of the hierarchical horizontal-vertical fractal encoding to a reasonable value can be summarized in the following points:

• establish the codebook on the fly – this prevents creating codebook blocks (and computing inner products) that will not be used by any range block
• reduce the codebook size to the minimal size that gives the desired fidelity
• parallelize the encoding if there is more than one processor
• use variance-based acceleration:
  – exclude from the search the codebook blocks with intensity variance much larger than the variance of the range block
  – treat range blocks with low variance in the same way as flat blocks

It is worth mentioning that the proposed fractal coder has been written in Java and compiled to byte code, which is interpreted by the Java Virtual Machine. An improvement in encoding time could be achieved by compiling the program to platform-dependent machine code. Native machine code can be created from Java source files with the gcj compiler², however it does not support Java 1.6 at the moment. Another way to get machine code is to use one of the Just-in-Time compilers, e.g. the IBM JIT Compiler, which re-compile the byte code to native code. Java programs, even when compiled to machine code, are usually slower than programs written in C/C++, which gives the best performance. Thus, the use of C/C++ would result in a further reduction of the encoding time.

Research Question 2: Which fractal compression method suits medical images best and gives the best results?

The survey of the literature showed two fractal compression methods that are superior to other approaches. The irregular regions approach is unmatched at higher compression ratios, and the Horizontal-Vertical approach gives the best results at low compression ratios. Because of the specific character of medical images (they require very high fidelity), the HV method was chosen – only the lower compression ratios can be used, because the higher ones would result in too high a distortion level.

However, a new approach is also considered in the thesis. It is based on the two algorithms mentioned above: the method based on irregular regions would construct the range regions by using the HV partitioning. In the opinion of the author, this combination could give better performance in the rate-distortion sense than any hierarchical or irregular regions-based approach.

Research Question 3: Does fractal compression preserve image quality better or worse than other irreversible (information lossy) compression methods?

According to objective measures, the elaborated fractal compression method turns out to be slightly weaker than the JPEG algorithm when the image is encoded with preservation of the fidelity required for medical images. Because wavelet coding outperforms JPEG, the fractal compression also gives worse results than wavelet compression. Fractal compression gives better results (in the rate-distortion sense) than JPEG at compression ratios higher than 18 : 1. When the compression ratio falls between 14 : 1 and 18 : 1, the objective measures do not give an unambiguous indication of which method, fractal or JPEG, produces the image closer to the original.

2. The GNU Compiler for the Java Programming Language

Nevertheless, some types of medical images can be compressed (with distortions at a satisfying level) at compression ratios where fractal compression gives better results – more in section 1.2.

The second approach to fractal compression – irregular regions based on HV partitioning – might give better rate-distortion results than JPEG.

However, fractal compression is inseparably bound with fractal magnification – a very useful feature that cannot be found in any other compression method. If fractal compression is treated not only as a method to compress images but also as a method to improve the presentation quality of the images, then there is no other method that could be compared against it.

Research Question 4: Can the results of fractal magnification be better than the results of traditional magnification methods?

In the author's opinion, the performed experiments leave no doubt – fractal magnification is superior to bilinear as well as bicubic interpolation. Although an image zoomed with fractal compression has a visible block effect, the sharpness of the image is much higher and the details are better visible than in images magnified through interpolation. Sharpness is one of the most important factors in medical images. When the image is out of focus, some small but very important details, like a fracture of a bone in an X-ray image, may become invisible. In addition, measurements are much easier and more reliable when the image is sharp enough, because the edges of tissues (or other orientation points) can be better localized. For example, measurement of the intima-media thickness is used in detecting and monitoring aortic atherosclerosis.

The objective measurement (Peak Signal to Noise Ratio) also unambiguously points at fractal magnification as the method that better magnifies medical images. The measurement is performed by calculating the PSNR between the original image and an image created by zooming the magnified image back out to the original image's size.

Future Work

The tests performed to evaluate the accuracy of the proposed fractal compression method utilized only objective measures. These measures are very helpful in the comparison of different compression methods, but they do not reflect the usefulness of the evaluated compression to the persons who read the images. Because of this, it is highly advisable to perform experiments with human observers in order to evaluate the quality of the compressed images. Preferably, the tests should involve specialists who in their day-to-day work establish diagnoses based on medical images. Such tests would make it possible to adjust the proposed fractal compression method to the Human Visual System (HVS).

The proposed fractal compression algorithm was tested only on Doppler ultrasonography images. Nevertheless, different types of medical images tolerate different distortions and amounts of lost information. In order to gain more general knowledge about the suitability of the method for medical imaging, other types of medical images shall also be subjected to the experiments.

The selection of the partitioning method was based on the assumption that the existing fractal compression methods based on irregular regions perform worse than Horizontal-Vertical partitioning at low compression ratios. The assumption could not be confirmed due to the lack of such a comparison of the two methods in the literature. In the future, the best method based on irregular regions (utilizing quadtree partitioning in the split phase) shall also be submitted to the tests that were made for the HV partitioning.

As was repeatedly suggested, further improvement in the rate-distortion sense and in the encoding time may be achieved by merging the two best fractal methods – the one based on irregular regions and the one based on HV partitioning. The way to do this is to utilize the HV partitioning as the first step (splitting phase) of creating the irregular regions. The literature provides solutions for merging the partitions into irregular regions, i.e. neighboring blocks with similar variance and average pixel intensity may be united into a single irregular range block. This thesis provides the solution to the only problem that cannot be solved with the help of the literature, because such a compression method has not been considered by any researcher: how to store the transformations' descriptions in the fractal code. The proposed solution is to use the format presented in this thesis (section 5.5.4). However, this new compression method with irregular regions still needs to be implemented and tested.

Appendix A

Sample Images

All images presented here have been encoded with the fractal method based on most-significant edge detection with Fisher's mechanism for rectangle degradation prevention. The minimal range block size has been set to 2 × 2, the domain offset to 2, and the search is not interrupted by the first found codebook block that fulfills the tolerance criterion. The presented images were encoded with various tolerance criteria.

Images #1 – #5 are complete ultrasonograms; these images were reconstructed to their original sizes. In order to fit the presented images onto pages without rescaling, they are rotated by 90 degrees. Images #6 and #7 are only parts of ultrasonograms – they were reconstructed to sizes larger than the original.

Figure A.1. Original image #1.

Figure A.2. Reconstructed image #1: error_t = 6, PSNR = 42.53 dB, CR = 6.26.

Figure A.3. Differential image #1: error_t = 6.

Figure A.4. Partitioning of image #1: error_t = 6.

Figure A.5. Reconstructed image #1: error_t = 11, PSNR = 40.02 dB, CR = 8.92.

Figure A.6. Differential image #1: error_t = 11.

Figure A.7. Partitioning of image #1: error_t = 11.

Figure A.8. Reconstructed image #1: error_t = 22, PSNR = 36.94 dB, CR = 14.11.

Figure A.9. Differential image #1: error_t = 22.

Figure A.10. Partitioning of image #1: error_t = 22.

Figure A.11. Reconstructed image #1: error_t = 100, PSNR = 30.01 dB, CR = 52.22.

Figure A.12. Differential image #1: error_t = 100.

Figure A.13. Partitioning of image #1: error_t = 100.

Figure A.14. Original image #2.

Figure A.15. Reconstructed image #2: error_t = 12, PSNR = 40.32 dB, CR = 9.68.

Figure A.16. Differential image #2: error_t = 12.

Figure A.17. Partitioning of image #2: error_t = 12.

Figure A.18. Original image #3.

Figure A.19. Reconstructed image #3: error_t = 10, PSNR = 40.43 dB, CR = 6.69.

Figure A.20. Differential image #3: error_t = 10.

Figure A.21. Partitioning of image #3: error_t = 10.

Figure A.22. Original image #4.

Figure A.23. Reconstructed image #4: error_t = 12, PSNR = 40.37 dB, CR = 5.64.

Figure A.24. Differential image #4: error_t = 12.

Figure A.25. Partitioning of image #4: error_t = 12.

Figure A.26. Original image #5.

Figure A.27. Reconstructed image #5: error_t = 10, PSNR = 40.12 dB, CR = 5.36.

Figure A.28. Differential image #5: error_t = 10.

Figure A.29. Partitioning of image #5: error_t = 10.

Figure A.30. Original image #6, size: 192 × 192.

Figure A.31. Partitioning of image #6: error_t = 1.

Figure A.32. Magnified image #6, size: 384 × 384, error_t = 1, PSNR = 41.36 dB.

Figure A.33. Original image #7, size: 96 × 96.

Figure A.34. Partitioning of image #7: error_t = 1.3.

Figure A.35. Magnified image #7, size: 384 × 384, error_t = 1.3, PSNR = 41.5 dB.

Appendix B

Glossary

1 – Fixed block: all pixels have the same value.

A – Attractor: fixed point of the operator W. A = lim_{i→∞} W^{∘i}(X_0)

avgHeight(R) – Average height in pixels of all range blocks that constitute the fractal operator.

avgWidth(R) – Average width in pixels of all range blocks that constitute the fractal operator.

b – Width and height of range blocks when uniform partitioning is used.

B_i – Rectangular block of pixels. May denote a range block, codebook block or domain block.

|B_i| – Number of pixels in the block B_i.

B_comp – Number of bits in the compressed image (length in bits of the compressed representation).

B_org – Number of bits in the original image (before compression).

bit_d – Bit Depth: number of bits used to store a single pixel of the original image.

bs – Base of the logarithm in the I(u_i) formula. When bs = 2 the unit of I(u_i) is the bit, when bs = 3 the unit is the trit, when bs = e (natural logarithm) the unit is the nat, and the last unit – the Hartley – is used when bs = 10.

BR – Bit Rate: the average number of bits in the compressed representation of the data per single element (symbol) in the original set of data. BR = B_comp / B_org

C – Virtual codebook: set of all spatially contracted domain blocks from D.

|C| – Length of the codebook C. |C| = |D|

C_i – Codebook block: spatially contracted domain block D_i.

|C_i| – Number of pixels in the codebook block C_i.

C_{R_i} – Subset of the codebook C containing the blocks that shall be compared with range block R_i during the search for a transformation.

CF – Spatial contraction factor used in the transformation τ_C.

CP – Compression Percentage. CP = (1 − 1/CR) · 100%

CR – Compression Ratio: the ability of the compression method to reduce the amount of disk space needed to store the data. CR = B_org / B_comp

D – Domain pool: set of all domain blocks utilized during encoding.

|D| – Length of the domain pool D. |D| = |C|

D(X, X̃) – Average distortion between the images X and X̃.
D(X, X̃) = E{d(X, X̃)} = Σ_{x_i ∈ X} Σ_{x̃_i ∈ X̃} f_{X̃,X}(x̃_i, x_i) · d(x_i, x̃_i)

D_i – Domain block.

d(x_i, x̃_i) – Distortion per symbol.

dc – Mean intensity of a block.

dc_bottom(n) – Average value of the block's pixels that lie in rows with indexes higher than n.

dc_left(m) – Average value of the block's pixels that lie in columns with indexes not higher than m.

dc_right(m) – Average value of the block's pixels that lie in columns with indexes higher than m.

dc_top(n) – Average value of the block's pixels that lie in rows with indexes not higher than n.

domainOffset – Offset in pixels between the two spatially closest domain blocks.

δ(X, X̃), δ(X, A) – Distance between two images / attractors.

E^h_ED(n) – Significance of the horizontal edge between the n-th and (n + 1)-th row.

E^v_ED(m) – Significance of the vertical edge between the m-th and (m + 1)-th column.

E^h_VM(n) – Sum of the intensity variances of the two blocks that can be created by cutting the range block R_i horizontally between the n-th and (n + 1)-th row.
E^h_VM(n) = Σ_{i=0}^{M−1} Σ_{j=0}^{n} (r_{i,j} − dc_top(n))² + Σ_{i=0}^{M−1} Σ_{j=n+1}^{N−1} (r_{i,j} − dc_bottom(n))²

E^v_VM(m) – Sum of the intensity variances of the two blocks that can be created by cutting the range block R_i vertically between the m-th and (m + 1)-th column.
E^v_VM(m) = Σ_{i=0}^{m} Σ_{j=0}^{N−1} (r_{i,j} − dc_left(m))² + Σ_{i=m+1}^{M−1} Σ_{j=0}^{N−1} (r_{i,j} − dc_right(m))²

error_t – Error threshold, distance tolerance: the maximal allowed distance between paired range and domain blocks.

f_X(x_i) – Occurrence probability of a given symbol x_i in the stream X.

f_{X̃|X}(x̃_i, x_i) – Conditional probability that the symbol x̃_i will occur in the reconstructed stream X̃ under the condition that the symbol x_i occurred in the source X.

g_Fisher – Fisher's mechanism for rectangle degradation prevention [Fis95c].
g_Fisher(x) = min(x, x_max − x) / x_max

g_Fisher – Fisher's mechanism of rectangle degradation prevention, adapted to the block splitting method based on variance minimization.
g_Fisher(g(x)) = (x_max/2 − min(x, x_max − x)) / x_max

g_flat – Simple mechanism for rectangle degradation prevention.
g_flat(x) = 0 when (x < t) or (x_max − x < t), and 1 otherwise

g_Saupe – Saupe's mechanism for rectangle degradation prevention [Sau98].
g_Saupe(x) = 0.4 · [(2x/x_max − 1)² + 1]

g_Saupe – Saupe's mechanism of rectangle degradation prevention, adapted to the block splitting method based on most-significant edge detection.
g_Saupe(x) = 0.8 − 0.4 · [(2x/x_max − 1)² + 1]

H(U) – Entropy: the amount of information specified by a stream U of symbols.
H(U) = Σ_{i=1}^{|U|} p(u_i) · I(u_i) = Σ_{i=1}^{|U|} p(u_i) · log_bs(1/p(u_i)) = −Σ_{i=1}^{|U|} p(u_i) · log_bs p(u_i)

h_n – Numerical value that characterizes the potential range block cutting line between rows n and n + 1. The values {h_n : 0 ≤ n < N} are used as the weights of all potential horizontal cutting lines.

height(B_i) – Number of pixels in a column of the block B_i.

I_m – Mutual Information: the average information that the random variables (X and X̃) convey about each other.
I_m(X, X̃) = H(X) − H(X | X̃) = H(X̃) − H(X̃ | X)

I(u_i) – Amount of information carried by the symbol u_i. The unit depends on the value of bs.
I(u_i) = log_bs(1/p_i)

IF – Image Fidelity: an objective measure of distortions.
IF = 1 − Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} [X(m,n) − X̃(m,n)]² / Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} [X(m,n)]²

leafNo – Number of leaves in the Huffman tree. In the thesis leafNo = |W|.

levelNo – Number of levels in the Huffman tree.

M – Number of columns in the original image.

m – Index of a column, 0 ≤ m < M.

MAE – Mean Absolute Error: an objective measure of distortions.
MAE = (1 / (M·N)) · Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} |X(m,n) − X̃(m,n)|

max_X – Maximal possible pixel value of the image X. max_X = 2^{bit_d} − 1

ME – Maximal Error: the maximal difference between corresponding pixels of X and X̃.

MSE – Mean Squared Error: an objective measure of distortions.
MSE = (1 / (M·N)) · Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} [X(m,n) − X̃(m,n)]²

N – Number of rows in the original image.

n – Index of a row, 0 ≤ n < N.

o_i – Offset coefficient of the intensity transform τ_I.

p_i – Probability that the symbol u_i will occur in the input sequence.

PSNR – Peak Signal to Noise Ratio: an objective measure of distortions, expressed in decibels (dB).
PSNR = 10 · log_10(max_X² / MSE)

R – Set of all range blocks in the image.

R_i – Range block.

R_i(m,n), D_i(m,n), C_i(m,n) – Pixel value in the m-th column and n-th row of the block R_i, D_i or C_i.

R_IN – Inner node of the partitioning tree: a range block divided horizontally or vertically. Its representation in the fractal code requires L bits.

r_{m,n} – Pixel value in the m-th column and n-th row of the range block R_i.

s – Contractivity factor of the fractal operator W, 0 < s < 1.

s_i – Scaling coefficient of the intensity transform τ_I^i.

size_t – Range size threshold: the minimal size of the range blocks, set before encoding.

σ(B_i) – Variance of the block B_i.

τ_i(D_i) – Transformation applied to the pixels of a domain block D_i before they are mapped to the pixels of R_i; part of the transformation w_i. τ_i = τ_i^I ∘ τ_e^S ∘ τ_C

τ_i^I(B_i) – Intensity transformation of the pixels within the domain block B_i. τ_i^I(B_i) = s_i · B_i + o_i · 1

τ_C(B_i) – Spatial contraction of a domain block.

τ_e^S(B_i) – Symmetry operation with index e applied to the domain block B_i.

U – Sequence of symbols. U = u_1, u_2, . . . , u_{|U|}

u_i – Symbol at the i-th position in the input sequence U.

v_m – Numerical value that characterizes the potential range block cutting line between columns m and m + 1. The values {v_m : 0 ≤ m < M} are used as the weights of all potential vertical cutting lines.

W – Fractal operator. W = ⋃_{i=1}^{|W|} w_i

|W| – Number of transformations in the fractal operator W.

w_i – A contractive affine transformation that is part of the fractal operator W.

width(B_i) – Number of pixels in a row of the block B_i.

X – Stream to be encoded with a lossy compression method; the original image. X = x_1, x_2, . . . , x_{|X|}

Ẋ – Magnified image.

Ẍ – Image Ẋ demagnified to the size of the original image X.

X̃ – Stream reconstructed from lossy compression; the reconstructed image. X̃ = x̃_1, x̃_2, . . . , x̃_{|X̃|}

X(m,n) – Intensity value of the pixel in the m-th column and n-th row of the image X.

X_0 – Initial image submitted to fractal decoding.

X_r – Image produced in iteration r of fractal decoding. X_r = W(X_{r−1})

x_i – Symbol at the i-th position in the input stream X that is encoded with lossy compression.

x̃_i – Symbol at the i-th position in the stream X̃ reconstructed from lossy compression.

x_max – Maximal possible value of the variable x.

Appendix C

Application Description and Instructions for Use

The WoWa Fractal Coder software was created in order to test in practice the suitability of fractal compression methods for medical imaging.

C.1. How to Run the Program

WoWa Fractal Coder requires Java 1.6 (also referred to as Java 6.0). The JRE (Java Runtime Environment) version of Java is enough to run the program. The newest version of the JRE can be downloaded from Sun's webpage (http://www.sun.com).

When Java 1.6 is installed, the zip file with the application has to be extracted into an empty folder. In this folder, an executable jar and a folder named lib shall appear. The application can be run by double-clicking the wowaCoder.jar file or by typing the following in the command line:

java -jar "wowaCoder.jar"

Because the application is very memory consuming, it is advised to increase the Java Heap Space. This can be done by specifying the -Xmx parameter, e.g.:

java -jar -Xmx250m wowaCoder.jar

C.2. Common Interface Elements and Functionalities

The software provides two fundamental views: WoWa Fractal Encoder and WoWa Fractal Decoder; the user can switch between them through the View menu in the menu bar. Both the encoder and the decoder window have almost the same menu bar, tool bar and status bar.

C.2.1. Menu Bar

In the menu bar, the following menus can be found:


Figure C.1. The WoWa Fractal Coder user interface: (a) the WoWa Fractal Encoder window; (b) the WoWa Fractal Decoder window.


File
– Load – displays an open-file dialog; in the encoder window the user picks the image to be encoded, and in the decoder window the file with the fractal operator describing an encoded image
– Save – the content of the log of the currently visible window can be saved, as well as the result of the last encoding (encoder window) or decoding (decoder window) process
– Close – exits the application

View – makes it possible to switch between the encoder and the decoder view, which can also be done by pressing the F2 key. In the encoder window there is one more menu item, which displays the Image Comparator window.

Options – gives access to dialog windows that allow changing the encoding and decoding settings as well as the application preferences

C.2.2. Tool Bar

The tool bars have three common buttons:

Load – has same effect as the Load item in the File menu

Start – begins the processing proper to currently visible window: encoding or decoding

Stop – if the encoding/decoding process is currently running than this button replaces the Start button and allows the user to interrupt this process

Besides these three buttons, both the encoder and the decoder have three more buttons. In the encoder there are:

a Save button, which allows the user to save the most recently produced fractal operator

a toggle button that shows/hides the selected area in the Original tab

a button that resets the selection to the default (the whole original image)

The decoder has the following buttons in these places:

a button that loads the most recently generated fractal operator directly from the encoder instead of loading it from a file

a button that lets the user set an initial image that will be passed to the decoding algorithm. The first iteration takes the domain blocks from the initial image; the reconstructed image size and proportions are the same as those of the initial image. If no initial image is loaded, a plain black image becomes the initial image. The size of an initial image created in this way equals the original size of the encoded image multiplied by the magnification factor (decoding settings), and the proportions are the same as in the original image.

a button that clears the decoder window – the images from the Decompressed tab and the Initial tab are erased and all tabs created during decoding are deleted.

C.2.3. Status Bar

The status bars in the encoder and decoder windows share three elements:

the path of the loaded file – the image to be encoded or the file with the fractal operator describing an encoded image

the description of the current program status – if encoding or decoding is currently being executed, the corresponding information appears in the status bar


the progress bar, which shows the progress of the process indicated by the label to the left of the progress bar (see the point above)

In the encoder, between the file path and the current process name, the location and size of the selected area of the loaded image can be found. In the same place, the decoder window shows the original size of the image encoded into the loaded fractal operator and the size to which the image is decompressed.

C.2.4. Pop-up Menus

Any image presented in the application can be zoomed in, zoomed out or saved to a file. All these actions can be accessed through a pop-up menu that is displayed after right-clicking on an image. When the cursor is over the pop-up menu, information about the current zoom of the image is displayed in its top left corner.

Right-clicking on the content of the Log tabs displays a pop-up menu with only one item, which allows the user to save the log to a *.txt file.

C.3. WoWa Fractal Encoder

The main part of the encoder window is occupied by a tabbed panel. There are several tabs:

“Original” – here the original image is displayed and the area of the image that shall be compressed can be defined.

“Reconstructed” – after encoding, the image reconstructed from the fractal code created during the encoding process can be found here (only if the Reconstruct the image after encoding option is enabled in the application preferences).

“Comparison” – the tab contains the selected area of the original image converted to the grayscale color model, the reconstructed image and the differential image. Histograms of pixel intensities in the original and reconstructed images are also presented.

“Partitions” – the tab contains two images that reflect the partitioning of the original image into range blocks.

“Transformations” – contains an interactive image based on the original image, which allows the user to track which domain blocks are mapped to which range blocks and how the domain blocks are transformed.

“Log” – the actions taken by the user are written to the log with a timestamp attached. The summary of the encoding process is also presented there.

The tabs with more complex content are described in more detail in the following subsections.

C.3.1. Original Tab

In this tab the user can not only inspect the image loaded for encoding but also limit the area of the image that shall be passed to the encoder.

The area can be delimited in two ways:


by left-clicking, the upper left corner of the selection is set; by holding the left mouse button and moving the cursor, the selection size is determined. This method can be used when the selection is not visible.

when the selection is shown, by adjusting its size and location through moving the displayed selection borders or by drag-and-drop performed on the selection rectangle.

The selection rectangle can be shown/hidden by clicking the show/hide selection toggle button in the tool bar. The rightmost button in the tool bar sets the selection to the whole loaded image, which is also the default selection.

C.3.2. Comparison Tab

Several images are displayed in this tab; their number depends on the application settings.

The selected area of the original image, which was passed to the fractal encoder, is always displayed. If the original image is in color, the selected area is converted to grayscale by calculating the luminance of each pixel with the function f(x, y) = 0.3 R + 0.59 G + 0.11 B (where R, G, B are the red, green and blue components of the original pixel value), which is used in the YUV and L*a*b color models.
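A minimal sketch of this conversion in Java, assuming the image is given as a standard BufferedImage (illustrative code, not the actual WoWa implementation):

    import java.awt.image.BufferedImage;

    /** Converts a color image to grayscale with the 0.3R + 0.59G + 0.11B luminance weights. */
    public final class Luminance {
        public static BufferedImage toGrayscale(BufferedImage src) {
            BufferedImage dst = new BufferedImage(
                    src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
            for (int y = 0; y < src.getHeight(); y++) {
                for (int x = 0; x < src.getWidth(); x++) {
                    int rgb = src.getRGB(x, y);
                    int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                    int lum = (int) Math.round(0.3 * r + 0.59 * g + 0.11 * b);
                    dst.getRaster().setSample(x, y, 0, lum);  // write the luminance into the single gray band
                }
            }
            return dst;
        }
    }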

If the encoded image is automatically reconstructed, the reconstructed image is also displayed. The image is decoded in accordance with the settings that are also used by the WoWa Fractal Decoder window. One of these settings is the zoom factor.

If the reconstructed image is zoomed in or zoomed out, another image is displayed next to it. This image is created by encoding the reconstructed image with the same algorithm and settings and decoding it to the size of the selected area of the original image.

A differential image can also be visible (only if it is enabled in the application preferences). If the zoom parameter is not equal to one, the differential image reflects the differences between this second reconstructed image and the selected area of the original image.

Both histograms in this tab, as well as the distance histogram in the Transformations tab, are presented as bar charts. The user can display an exact histogram value by left-clicking on the chart. A double left-click on the chart erases all previously requested messages about histogram values from the chart.

C.3.3. Partitions Tab

The partitioning of the image into range blocks is presented in two images. The image on the right shows a net created from the boundaries of all range blocks, displayed with the original image in the background. The image on the left does not show the original image; instead, additional information about the range blocks and the search process is displayed there. The range blocks filled with blue were paired with codebook blocks provided by the solid codebook. Yellow marks the range blocks that were not paired with any codebook block – they are approximated with a uniform block.


C.3.4. Transformations Tab

As already mentioned, this tab contains an interactive image based on the original image. When the cursor is over any pixel of the image, a yellow rectangle is drawn around the range block to which the pixel belongs. Another, darker yellow rectangle surrounds the domain block mapped to that range block. After left-clicking on the image, the currently displayed range and domain blocks are marked permanently (the color of the rectangles changes to green). When the mouse leaves the image, the transformation represented by the green rectangles becomes the selected one, which makes it possible to zoom in and take a closer look at the images at the bottom of the tab.

The bottom part of the tab is occupied by a chain of images that describe the currently selected transformation, starting from the domain block, through the codebook block, and finishing with the intensity-transformed codebook block. There are two more images in this chain: the content of the range block in the original image (with which the intensity-transformed codebook block is compared during the search for matching domain and range blocks) and the content of the range block in the reconstructed image.

Next to the interactive original image, a histogram shows the frequency of occurrence of spatial distances between paired range and domain blocks. The distances between the upper left corners of these blocks are taken into consideration. This histogram can be very helpful in determining whether spiral search may have any positive effect on the encoding time.
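A minimal sketch of how such a histogram can be accumulated, assuming the paired blocks are given simply as arrays of upper-left corner coordinates (illustrative names, not the actual WoWa classes):

    // Histogram of Euclidean distances between the upper-left corners of
    // paired range and domain blocks, with a bin width of one pixel.
    final class DistanceHistogram {
        static int[] compute(int[][] rangeCorners, int[][] domainCorners, int maxDistance) {
            int[] bins = new int[maxDistance + 1];
            for (int i = 0; i < rangeCorners.length; i++) {
                double dx = rangeCorners[i][0] - domainCorners[i][0];
                double dy = rangeCorners[i][1] - domainCorners[i][1];
                int d = (int) Math.min(Math.round(Math.sqrt(dx * dx + dy * dy)), maxDistance);
                bins[d]++;  // distances beyond maxDistance are clamped into the last bin
            }
            return bins;
        }
    }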

C.3.5. Log Tab

In the log, besides annotating events such as starting the encoder window, loading an image, or starting/finishing encoding, summaries of the encoding process are also presented. Such a summary provides the user with the following information:

description of the original image:

the file path

the size of the loaded image

the position of the upper left vertex of the image area selected to be encoded

the size of the selected area

decoding settings in force

the zoom factor

the number of decoding iterations

which decoding method was used (single-time quantization vs. standard decoding)

encoding settings in force

the range size threshold

the contraction factor of the contraction transformation

the offset between neighboring domains in the domain pool

the method used to split range blocks in two

the maximal error tolerated when comparing a codebook block with a range block

time consumption

duration of the encoding

duration of the decoding


number of search threads active during encoding (influences only the encoding time)

characteristics of the fractal operator

number of transformations

average, largest and smallest range block spatial sizes

average spatial distance between range blocks and domain blocks

quality of the recovered image, expressed by calculated distance measures

image fidelity

average difference

maximal difference

mean squared error

peak mean squared error

signal to noise ratio

peak signal to noise ratio

C.3.6. Image Comparator

The window allows the user to calculate the error between any two images with the same measures and algorithms that are used to assess the quality of the reconstructed image after encoding. The interface is very simple – there are two buttons at the top, which allow the user to load the two images to be compared. When two images are loaded, a summary appears at the bottom of the window. The Image Comparator is used when there is a need to assess the quality of compression or magnification methods other than those implemented in the WoWa Fractal Coder.
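A minimal sketch of two of these measures (mean squared error and peak signal to noise ratio) for equally sized 8-bit grayscale images, assuming the images are given as plain int arrays (illustrative code, not the actual WoWa implementation):

    // Mean squared error and PSNR between two equally sized 8-bit grayscale images.
    final class ErrorMeasures {
        static double mse(int[][] a, int[][] b) {
            double sum = 0.0;
            int h = a.length, w = a[0].length;
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    double d = a[y][x] - b[y][x];
                    sum += d * d;
                }
            }
            return sum / (w * h);
        }

        static double psnr(int[][] a, int[][] b) {
            double err = mse(a, b);  // assumes the images differ, i.e. err > 0
            return 10.0 * Math.log10(255.0 * 255.0 / err);
        }
    }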

C.4. WoWa Fractal Decoder

In the decoder window too, a tabbed panel takes the central place. Initially there are three tabs:

“Decompressed” – after decoding, it contains the image reconstructed from the fractal code loaded earlier

“Log” – all user actions are recorded and marked with a timestamp

“Initial” – shows the image (if set) that will be the starting point of the decoding algorithm – it is used by the first iteration as the base image

During decoding, tabs named with successive integers starting from 0 are created – the products of successive decoding iterations are presented in these tabs. The number of these tabs is equal to the number of iterations set in the decoding settings. The product of each iteration becomes the base image for the next iteration – all domain blocks are taken from this image.
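A minimal sketch of this iteration scheme, assuming a hypothetical FractalOperator type whose apply method maps a base image to the next approximation (the names are illustrative, not the actual WoWa classes):

    // Iterative fractal decoding: each iteration applies the fractal operator W
    // to the product of the previous iteration, starting from the initial image.
    interface FractalOperator {
        int[][] apply(int[][] baseImage);  // returns W(X_{r-1}); domain blocks are taken from the base image
    }

    final class Decoder {
        static int[][] decode(FractalOperator w, int[][] initialImage, int iterations) {
            int[][] current = initialImage;
            for (int r = 1; r <= iterations; r++) {
                current = w.apply(current);  // product of iteration r
            }
            return current;
        }
    }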

C.5. Settings

The user can define multiple parameters that have an influence on the way the application works and on the encoding and decoding algorithms.


C.5.1. Application Preferences

The Application Preferences define the behavior of different elements of the application:

Zoom step – the factor by which an image is zoomed in or out after pressing the zoom in / zoom out buttons in the image's pop-up menu

Minimal size of the image selection – too small a selected area of the original image (in the encoder window) may cause problems with resizing/relocating the selection

a check box that allows the user to decide whether the image shall be automatically decoded after encoding. If the option is unchecked, the quality of an image reconstructed from code created by the software can be assessed only through the Image Comparator.

a check box that indicates whether the differential image shall be created after automatic decoding (see above). Its pixel values may be equal to the difference between the corresponding pixel values in the original and reconstructed images, or this difference may be multiplied by a factor set by the user (a sketch of this computation is given after this list).

the user can choose between two file formats (see section 5.5). If the new approach is chosen, the cell coordinates can be stored with or without the use of Huffman coding.
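A minimal sketch of the differential image mentioned above, assuming 8-bit grayscale images and a user-defined amplification factor (illustrative code, not the actual WoWa implementation):

    // Differential image: per-pixel absolute difference between the original and the
    // reconstructed image, optionally amplified by a factor and clipped to the 8-bit range.
    final class DifferentialImage {
        static int[][] compute(int[][] original, int[][] reconstructed, double factor) {
            int h = original.length, w = original[0].length;
            int[][] diff = new int[h][w];
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    int d = (int) Math.round(Math.abs(original[y][x] - reconstructed[y][x]) * factor);
                    diff[y][x] = Math.min(d, 255);
                }
            }
            return diff;
        }
    }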

C.5.2. Encoder Settings

The encoding algorithm depends largely on the settings made by the user. The parameters that can be defined by the user are:

the range block size threshold

the contraction factor of the contraction transformation

the offset between neighboring domain blocks in the domain pool – it cannot be smaller than the contraction factor

which mechanism shall be used to split the range blocks in two during encoding. Two mechanisms are possible: edge detection and variance minimization (see chapter 5.2).

which rectangle degradation prevention mechanism shall be used when splitting each range block. There are three options: Fisher's function, Saupe's function and a flat function that has no influence on the splitting mechanism besides eliminating the cutting lines that would produce range blocks smaller than the range block size threshold.

the distance tolerance criterion – the maximal allowed error between the paired range block and codebook block. The value from the text field is interpreted as the square of the maximal allowed root mean squared error.

whether the symmetry operations shall be used during the encoding

whether the search shall be interrupted after finding the first codebook block that yields an error smaller than the tolerance criterion for the given range block, or whether the best match shall always be found

whether the codebook shall be determined only on the fly (while the search is performed) or whether the part of the codebook with the smallest codebook blocks shall be fixed in a preliminary phase (this part is called the “solid codebook”). While the solid codebook is being built, the codebook blocks' inner products ⟨C_j, 1⟩ and ⟨C_j, C_j⟩ are calculated, as well as the average pixel intensity within the block and the variance of the pixel intensity (if necessary, see the next bullet). This computation is performed only once when the codebook block is in the solid codebook, whereas it has to be repeated each time the block is provided by the “on the fly” codebook. The user can also set the maximal size of codebook blocks that will be added to the solid codebook: any codebook block whose sides are both not longer than this value will be looked up only in the solid codebook. The solid codebook remains in memory over the whole encoding process, while the “on the fly” codebook is only a piece of functionality and no data structures have to be stored.

a variance-based acceleration technique can be activated for the encoding process. If this option is turned on, additional information about the blocks is required by the encoding algorithm – the variance of the pixel values. The user has two possibilities to use this information in order to speed up the encoding; a sketch of both checks is given after this list.

setting the variance of the shade blocks – all range blocks with variance not larger than the value given by the user are treated as shade blocks, i.e. no search for a matching codebook block is performed and the range block is approximated with a uniform (fixed) block

setting the maximal allowed distance between codebook block and range block in terms of the variance value – during the search for the codebook block for a given range block, only those codebook blocks are considered whose variance is not smaller than the range block's variance and exceeds it by no more than the user-given value. All other codebook blocks are omitted during the search. This acceleration mechanism can be applied to the whole codebook or only to one part of it (the solid codebook or the “on the fly” codebook).

how many threads should perform the search for matching range and codebook blocks during encoding
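A minimal sketch of the two variance-based checks described in the acceleration bullet above (shade-block classification and codebook-block elimination), using illustrative names rather than the actual WoWa classes:

    // Variance-based acceleration checks performed during the search.
    final class VarianceChecks {
        // Range blocks with small variance are treated as shade blocks: no codebook
        // search is performed and the block is approximated with a uniform block.
        static boolean isShadeBlock(double rangeVariance, double shadeThreshold) {
            return rangeVariance <= shadeThreshold;
        }

        // Only codebook blocks whose variance lies in
        // [rangeVariance, rangeVariance + maxVarianceDistance] are compared with the range block.
        static boolean codebookBlockEligible(double codebookVariance, double rangeVariance,
                                             double maxVarianceDistance) {
            return codebookVariance >= rangeVariance
                    && codebookVariance <= rangeVariance + maxVarianceDistance;
        }
    }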

C.5.3. Decoder Settings

Most of the decoder settings are used not only during decoding started in the decoder window but also in decoding performed automatically after encoding. The following options can be set:

number of decoding iterations

whether the products of successive decoding iterations shall be displayed in tabs of the decoder window

whether the decoding should use matrices of double-precision numbers to store the products of the iterations (decoding with single-time quantization) or matrices of integers (the standard approach).

whether the reconstructed image should be zoomed in or zoomed out compared to the original image size, and by what factor.

Appendix D

Source Code and Executable Files

The source code and executable files are attached to the thesis on a CD (hard copy of the thesis) or packed into the same zip file as the PDF document (electronic version).

The source code of the WoWa Fractal Coder is compressed to a zip file placed in the folder src, and the executables (also compressed to a zip file) are in the folder app.

The archive with the executable files also contains several sample medical images.

List of Figures

1.1  Examples of x-ray images. . . . . . . 7
1.2  Example of a Computerized Tomography image (from [Gon02]) . . . . . . . 8
1.3  Examples of gamma-ray images (from [Gon02]) . . . . . . . 9
1.4  Examples of magnetic resonance images (from normartmark.blox.pl, 21.09.2007). . . . . . . 10
1.5  Example of USG image. . . . . . . 11
2.1  General scheme for lossy compression. . . . . . . 16
2.2  Regular scalar quantization. . . . . . . 17
2.3  Compression system model in rate-distortion theory . . . . . . . 18
2.4  The relationship between bit rate and distortion in lossy compression. . . . . . . 19
2.5  Self-similarity in real images (from einstein.informatik.uni-oldenburg.de/rechnernetze/fraktal.htm, 19.01.2008). . . . . . . 20
2.6  Generation of the Sierpinski triangle. Four first iterations and the attractor. . . . . . . 22
2.7  Barnsley fern . . . . . . . 23
2.8  Fractal magnifier block diagram . . . . . . . 26
2.9  Fractal magnification process . . . . . . . 26
3.1  Quadtree partitioning of a range. Four iterations. . . . . . . 30
3.2  Horizontal-vertical partitioning of a range. Four iterations. . . . . . . 32
3.3  Region edge maps and context modeling . . . . . . . 34
3.4  Performance of compression methods with irregular partitions. . . . . . . 35
3.5  Probability density function of block offset. . . . . . . 37
3.6  Spiral search (from [Bar94b]) . . . . . . . 37
3.7  Distributions of scaling and offset coefficients (second order polynomial intensity transformation) (from [Zha98]) . . . . . . . 43
5.1  Comparison of fractal compression methods based on irregular-shaped regions and HV partitioning. . . . . . . 56
5.2  Hierarchical fractal encoding. Block diagram. . . . . . . 58
5.3  Isometries for rectangle block. . . . . . . 65
5.4  The structure of the fractal code describing a single transformation depending on the position of the range block with respect to the position of the underlying cell. . . . . . . 68
5.5  Grid put on an image. The currently processed cell and the neighboring cells are marked with triangles with labels. The order of traversing the grid is shown by the arrows in the background. . . . . . . 70
5.6  Example of Huffman tree used for constructing the binary representation of the fractal operator. Nodes marked with weights and indexes in the ordered list. . . . . . . 72
6.1  An example medical image and its fractally encoded version. . . . . . . 83
6.2  The proposed compression algorithm performance depending on the block splitting mechanism and rectangle degradation prevention mechanism. ED denotes the splitting mechanism based on edge detection and VM the mechanism based on variance minimization. . . . . . . 84
6.3  Test of the bit allocation for the intensity transformation coefficients. . . . . . . 86
6.4  Test of the bit allocation for the intensity transformation coefficients. . . . . . . 88
6.5  Influence of the codebook size on the PSNR and the encoding time. . . . . . . 89
6.6  Best match searching versus first match searching. . . . . . . 90
6.7  Performance of different codebook types. . . . . . . 91
6.8  Influence of the classification into “shade” and “ordinary” blocks on the encoding time, compression ratio and PSNR. . . . . . . 92
6.9  Performance of the codebook block elimination based on the blocks' variance difference. . . . . . . 93
6.10 Histogram of spatial distances between matching codebook blocks and range blocks. . . . . . . 94
6.11 Comparison of the proposed fractal compression method and JPEG according to different objective measures. . . . . . . 95
6.12 Comparison of fractal magnification with interpolation methods for different magnification ratios. . . . . . . 97
6.13 Fractal magnification and interpolation methods for different image sizes. . . . . . . 98
6.14 Comparison of the proposed fractal compression/magnification method and interpolation methods according to different objective measures. . . . . . . 99
6.15 Fractal magnification and bicubic interpolation examples. . . . . . . 100
A.1  Original image #1. . . . . . . 106
A.2  Reconstructed image #1: error_t = 6, PSNR = 42.53 dB, CR = 6.26. . . . . . . 107
A.3  Differential image #1: error_t = 6. . . . . . . 108
A.4  Partitioning of image #1: error_t = 6. . . . . . . 109
A.5  Reconstructed image #1: error_t = 11, PSNR = 40.02 dB, CR = 8.92. . . . . . . 110
A.6  Differential image #1: error_t = 11. . . . . . . 111
A.7  Partitioning of image #1: error_t = 11. . . . . . . 112
A.8  Reconstructed image #1: error_t = 22, PSNR = 36.94 dB, CR = 14.11. . . . . . . 113
A.9  Differential image #1: error_t = 22. . . . . . . 114
A.10 Partitioning of image #1: error_t = 22. . . . . . . 115
A.11 Reconstructed image #1: error_t = 100, PSNR = 30.01 dB, CR = 52.22. . . . . . . 116
A.12 Differential image #1: error_t = 100. . . . . . . 117
A.13 Partitioning of image #1: error_t = 100. . . . . . . 118
A.14 Original image #2. . . . . . . 119
A.15 Reconstructed image #2: error_t = 12, PSNR = 40.32 dB, CR = 9.68. . . . . . . 120
A.16 Differential image #2: error_t = 12. . . . . . . 121
A.17 Partitioning of image #2: error_t = 12. . . . . . . 122
A.18 Original image #3. . . . . . . 123
A.19 Reconstructed image #3: error_t = 10, PSNR = 40.43 dB, CR = 6.69. . . . . . . 124
A.20 Differential image #3: error_t = 10. . . . . . . 125
A.21 Partitioning of image #3: error_t = 10. . . . . . . 126
A.22 Original image #4. . . . . . . 127
A.23 Reconstructed image #4: error_t = 12, PSNR = 40.37 dB, CR = 5.64. . . . . . . 128
A.24 Differential image #4: error_t = 12. . . . . . . 129
A.25 Partitioning of image #4: error_t = 12. . . . . . . 130
A.26 Original image #5. . . . . . . 131
A.27 Reconstructed image #5: error_t = 10, PSNR = 40.12 dB, CR = 5.36. . . . . . . 132
A.28 Differential image #5: error_t = 10. . . . . . . 133
A.29 Partitioning of image #5: error_t = 10. . . . . . . 134
A.30 Original image #6, size: 192 × 192. . . . . . . 135
A.31 Partitioning of image #6: error_t = 1. . . . . . . 135
A.32 Magnified image #6, size: 384 × 384, error_t = 1, PSNR = 41.36 dB. . . . . . . 136
A.33 Original image #7, size: 96 × 96. . . . . . . 137
A.34 Partitioning of image #7: error_t = 1.3. . . . . . . 137
A.35 Magnified image #7, size: 384 × 384, error_t = 1.3, PSNR = 41.5 dB. . . . . . . 138
C.1  The WoWa Fractal Coder user interface. . . . . . . 150

Bibliography

[AL99] Anderson-Lemieux, A., Knoll, E. Digital image resolution: what it means and how it can work for you. In Proceedings of the 1999 IEEE International Professional Communication Conference IPCC 99 (Communication Jazz: Improvising the New International Communication Culture), pp. 231–236, 1999.
[Ary93] Arya, S., Mount, D. M. Algorithms for fast vector quantization. In Proceedings of the IEEE Data Compression Conference DCC'93 (J. A. Storer, M. Cohn, eds.), pp. 381–390, Snowbird, UT, USA, 1993.
[Bah93] Baharav, Z., Malah, D., Karnin, E. Hierarchical interpretation of fractal image coding and its applications to fast decoding. In International Conference on Digital Signal Processing, Cyprus, 1993.
[Bah95] Baharav, Z., Malah, D., Karnin, E. Hierarchical interpretation of fractal image coding and its applications. In Fisher [Fis95a].
[Bar93] Barthel, K. U., Voyé, T., Noll, P. Improved fractal image coding. In Proceedings of the International Picture Coding Symposium PCS'93, p. 1.5, Lausanne, 1993.
[Bar94a] Barthel, K. U., Schüttemeyer, J., Voyé, T., Noll, P. A new image coding technique unifying fractal and transform coding. In Proceedings of the IEEE International Conference on Image Processing ICIP-94, vol. III, pp. 112–116, Austin, Texas, 1994.
[Bar94b] Barthel, K. U., Voyé, T. Adaptive fractal image coding in the frequency domain. In Proceedings of the International Workshop on Image Processing, vol. XLV, pp. 33–38, Budapest, Hungary, 1994.
[BE95] Bani-Eqbal, B. Speeding up fractal image compression. In Proceedings of the IS&T/SPIE 1995 Symposium on Electronic Imaging: Science & Technology, vol. 2418: Still-Image Compression, pp. 67–74, 1995.
[Bea90] Beaumont, J. M. Advances in block based fractal coding of still pictures. In Proceedings of the IEE Colloquium: The application of fractal techniques in image processing, pp. 3.1–3.6, 1990.
[Bed92] Bedford, T., Dekking, F. M., Keane, M. S. Fractal image coding techniques and contraction operators. Nieuw Archief voor Wiskunde (Groningen), vol. 10(3), pp. 185–218, 1992. ISSN 0028-9825.
[Bel02] Belloulata, K., Konrad, J. Fractal image compression with region-based functionality. IEEE Transactions on Image Processing, vol. 11(4), pp. 351–362, 2002.
[Bog92] Bogdan, A., Meadows, H. E. Kohonen neural network for image coding based on iteration transformation theory. In Proceedings of SPIE Neural and Stochastic Methods in Image and Signal Processing, vol. 1766, pp. 425–436, 1992.
[Bos95] Boss, R. D., Jacobs, E. W. Archetype classification in an iterated transformation image compression algorithm. In Fisher [Fis95a], pp. 79–90.
[Bre98] Breazu, M., Toderean, G. Region-based fractal image compression using deterministic search. In Proceedings of ICIP-98 IEEE International Conference on Image Processing, Chicago, 1998.
[Cas95] Caso, G., Obrador, P., Kuo, C.-C. J. Fast methods for fractal image encoding. In Proceedings of the IS&T/SPIE Visual Communications and Image Processing 1995 Symposium on Electronic Imaging: Science & Technology (L. T. Wu, ed.), vol. 2501, pp. 583–594, Taipei, Taiwan, 1995.
[CH04] He, C., X. H., Yang, S. X. Variance-based accelerating scheme for fractal image encoding. Electronics Letters, vol. 40(2), pp. 115–116, 2004.
[Cha97] Chang, Y.-C., Shyu, B.-K., Wang, J.-S. Region-based fractal image compression with quadtree segmentation. In ICASSP '97: Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, p. 3125, IEEE Computer Society, Washington, DC, USA, 1997. ISBN 0-8186-7919-0.
[Cha00] Chang, Y.-C., Shyu, B.-K., Wang, J.-S. Region-based fractal compression for still image. In Proceedings of WSCG'2000, the 8th International Conference in Central Europe on Computer Graphics, Visualization and Interactive Digital Media, Plzen, Czech Republic, 2000.
[Che98] Chen, C.-C. On the selection of image compression algorithms. In Proceedings of the International Conference on Pattern Recognition (ICPR), vol. 02, pp. 1500–1504, 1998. ISSN 1051-4651.
[Dal92] Dallwitz, M. J. An introduction to computer images. TDWG Newsletter, vol. 7, pp. 1–3, 1992.
[Dav94] Davoine, F., Chassery, J.-M. Adaptive Delaunay triangulation for attractor image coding. In Proceedings of the 12th International Conference on Pattern Recognition (ICPR), pp. 801–803, Jerusalem, 1994.
[Dav95] Davoine, F., Svensson, J., Chassery, J.-M. A mixed triangular and quadrilateral partition for fractal image coding. In Proceedings of the IEEE International Conference on Image Processing ICIP-95, vol. III, pp. 284–287, IEEE Computer Society, Washington, DC, 1995. ISBN 0-8186-7310-9.
[Dav96] Davoine, F., Antonini, M., Chassery, J.-M., Barlaud, M. Fractal image compression based on Delaunay triangulation and vector quantization. IEEE Transactions on Image Processing, vol. 5(2), pp. 338–346, 1996.
[Dek95a] Dekking, F. M. Fractal image coding: some mathematical remarks on its limits and its prospects. Tech. report DUT-TWI-95-95, Faculty of Technical Mathematics and Informatics, Delft University of Technology, Delft, The Netherlands, 1995.
[Dek95b] Dekking, F. M. An inequality for pairs of martingales and its applications to fractal image coding. Tech. report DUT-TWI-95-10, Faculty of Technical Mathematics and Informatics, Delft University of Technology, 1995.
[Deo03] Deorowicz, S. Universal lossless data compression algorithms. Doctor of Philosophy Dissertation, Silesian University of Technology, 2003.
[Duf92] Dufaux, F., Kunt, M. Multigrid block matching motion estimation with an adaptive local mesh refinement. In Proceedings of SPIE Visual Communications and Image Processing 1992, vol. 1818, pp. 97–109, IEEE, 1992.
[Fis92a] Fisher, Y. Fractal image compression. Tech. report 12, Department of Mathematics, Technion Israel Institute of Technology, 1992. SIGGRAPH '92 course notes.
[Fis92b] Fisher, Y. Fractal image compression. In Chaos and fractals: new frontiers of science, 1992.
[Fis92c] Fisher, Y., Jacobs, E. W., Boss, R. D. Fractal image compression using iterated transforms. In Image and text compression (J. A. Storer, ed.), chapter 2, pp. 35–61. Kluwer Academic Publishers Group, Norwell, MA, USA, and Dordrecht, The Netherlands, 1992. ISBN 0-7923-9243-4.
[Fis94] Fisher, Y., Shen, T. P., Rogovin, D. A comparison of fractal methods with discrete cosine transform (DCT) and wavelets. Proceedings of the SPIE, vol. 2304-16, pp. 132–143, 1994. ISSN 0277-786X.
[Fis95a] Fisher, Y. (ed.). Fractal image compression: theory and application. Springer-Verlag, Berlin / Heidelberg / London, 1995. ISBN 0-387-94211-4.
[Fis95b] Fisher, Y. Fractal image compression with quadtrees. In Fractal image compression: theory and application [Fis95a], pp. 55–77.
[Fis95c] Fisher, Y., Menlove, S. Fractal encoding with HV partitions. In Fisher [Fis95a], pp. 119–126.
[Fri94] Frigaard, C., Gade, J., Hemmingsen, T., Sand, T. Image compression based on a fractal theory. Internal Report S701, Institute for Electronic Systems, Aalborg University, Aalborg, Denmark, 1994.
[Fri95] Frigaard, C. Fast fractal 2D/3D image compression. Manuscript, Institute for Electronic Systems, Aalborg University, 1995.
[GA93] Gharavi-Alkhansari, M., Huang, T. S. A fractal-based image block-coding algorithm. In Proceedings of ICASSP-1993 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 345–348, 1993.
[GA94a] Gharavi-Alkhansari, M., Huang, T. S. Fractal based techniques for a generalized image coding method. In Proceedings of ICIP-94 IEEE International Conference on Image Processing, vol. III, pp. 122–126, Austin, Texas, 1994.
[GA94b] Gharavi-Alkhansari, M., Huang, T. S. Generalized image coding using fractal-based methods. In Proceedings of the International Picture Coding Symposium PCS'94, pp. 440–443, Sacramento, California, 1994.
[GA96] Gharavi-Alkhansari, M., Huang, T. S. Fractal image coding using rate-distortion optimized matching pursuit. In Proceedings of SPIE Visual Communications and Image Processing 1996 (R. Ansari, M. J. Smith, eds.), vol. 2727, pp. 1386–1393, Orlando, Florida, 1996.
[Gon02] Gonzalez, R. C., Woods, R. E. Digital image processing. Prentice-Hall, New Jersey, 2nd edn., 2002. ISBN 0-201-18075-8.
[Göt95] Götting, D., Ibenthal, A., Grigat, R.-R. Fractal image coding and magnification using invariant features. In NATO ASI Conference on Fractal Image Encoding and Analysis, Trondheim, Norway, 1995.
[Göt97] Götting, D., Ibenthal, A., Grigat, R.-R. Fractal image coding and magnification using invariant features. Fractals, vol. 5 (Supplementary Issue), pp. 65–74, 1997.
[Ham97a] Hamzaoui, R. Codebook clustering by self-organizing maps for fractal image compression. Fractals, vol. 5 (Supplementary Issue), pp. 27–38, 1997.
[Ham97b] Hamzaoui, R. Ordered decoding algorithm for fractal image compression. In Proceedings of the International Picture Coding Symposium PCS'97, pp. 91–95, Berlin, Germany, 1997.
[Har97] Hartenstein, H., Saupe, D., Barthel, K.-U. VQ-encoding of luminance parameters in fractal coding schemes. In Proceedings of ICASSP-97 (IEEE International Conference on Acoustics, Speech and Signal Processing), vol. 4, pp. 2701–2704, Munich, Germany, 1997.
[Har00] Hartenstein, H., Ruhl, M., Saupe, D. Region-based fractal image compression. IEEE Transactions on Image Processing, vol. 9(7), pp. 1171–1184, 2000.
[Hür93] Hürtgen, B., Stiller, C. Fast hierarchical codebook search for fractal coding of still images. Proceedings of the SPIE, vol. 1977, pp. 397–408, 1993.
[Hür94] Hürtgen, B., Mols, P., Simon, S. F. Fractal transform coding of color images. In Proceedings of SPIE Visual Communications and Image Processing (A. K. Katsaggelos, ed.), vol. 2308, pp. 1683–1691, 1994.
[Jac89] Jacquin, A. E. Image coding based on a fractal theory of iterated contractive Markov operators, part II: construction of fractal codes for digital images. Tech. report 91389-017, Georgia Institute of Technology, 1989.
[Jac90a] Jacquin, A. E. Fractal image coding based on a theory of iterated contractive image transformations. Proceedings of the SPIE, vol. 1360, pp. 227–239, 1990. ISSN 0277-786X.
[Jac90b] Jacquin, A. E. A novel fractal block-coding technique for digital images. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP-1990, vol. 4, pp. 2225–2228, 1990.
[Jac92] Jacquin, A. E. Image coding based on a fractal theory of iterated contractive image transformations. IEEE Transactions on Image Processing, vol. 1(1), pp. 18–30, 1992.
[Jac93] Jacquin, A. E. Fractal image coding: a review. Proceedings of the IEEE, vol. 81(10), pp. 1451–1465, 1993. ISSN 0018-9219.
[Kan96] Kang, H.-S., Kim, S.-D. Fractal decoding algorithm for fast convergence. Optical Engineering, vol. 35(11), pp. 3191–3198, 1996.
[Kao91] Kaouri, H. Fractal coding of still images. In IEE 6th International Conference on Digital Processing of Signals in Communications, pp. 235–239, 1991.
[Kof06] Koff, D. A., Shulman, H. An overview of digital compression of medical images: can we use lossy image compression in radiology? Canadian Association of Radiologists Journal, vol. 57, pp. 211–217, 2006.
[Kom95] Kominek, J. Algorithm for fast fractal image compression. In Digital video compression: algorithms and technologies (A. A. Rodriguez, R. J. Safranek, E. J. Delp, eds.), vol. 2419, pp. 296–305, San Jose, CA, USA, 1995.
[Kwi01] Kwiatkowski, J., Kwiatkowska, W., Kawa, K., Kania, P. Using fractal coding in medical image magnification. In PPAM (R. Wyrzykowski, J. Dongarra, M. Paprzycki, J. Wasniewski, eds.), Lecture Notes in Computer Science, vol. 2328, pp. 517–525, Springer, 2001. ISBN 3-540-43792-4.
[Lee98] Lee, C. K., Lee, W. K. Fast fractal image block coding based on local variances. IEEE Transactions on Image Processing, vol. 7(6), pp. 888–891, 1998.
[Lep95] Lepsøy, S., Øien, G. E. Fast attractor image encoding by adaptive codebook clustering. In Fisher [Fis95a], pp. 177–197.
[Lin95a] Lin, H., Venetsanopoulos, A. N. Fast fractal image coding using pyramids. In Proceedings of the 8th International Conference on Image Analysis and Processing ICIAP '95 (C. Braccini, L. D. Floriani, G. Vernazza, eds.), Lecture Notes in Computer Science, vol. 974, pp. 649–654, Springer-Verlag, San Remo, Italy, 1995. ISBN 3-540-60298-4.
[Lin95b] Lin, H., Venetsanopoulos, A. N. A pyramid algorithm for fast fractal image compression. In Proceedings of the 1995 International Conference on Image Processing ICIP '95, vol. 3, pp. 596–599, IEEE Computer Society, Washington, DC, USA, 1995. ISBN 0-8186-7310-9.
[Lin97] Lin, H. Fractal image compression using pyramids. Doctor of Philosophy Dissertation, University of Toronto, 1997.
[Lu97] Lu, N. Fractal imaging. Academic Press, 1997.
[Man83] Mandelbrot, B. B. The fractal geometry of nature. W. H. Freeman and Co., New York, 1983.
[MK94] Kawamata, M., M. N., Higuchi, T. Multi-resolution tree search for iterated transformation theory-based coding. In Proceedings of the IEEE International Conference on Image Processing ICIP-94, vol. III, pp. 137–141, 1994.
[Mon92] Monro, D. M., Dudbridge, F. Fractal approximation of image blocks. In Proceedings of ICASSP-1992 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 3, pp. 485–488, 1992.
[Mon93a] Monro, D. M. Class of fractal transforms. Electronics Letters, vol. 29(4), pp. 362–363, 1993.
[Mon93b] Monro, D. M. Generalized fractal transforms: complexity issues. In Proceedings of the DCC'93 Data Compression Conference (J. A. Storer, M. Cohn, eds.), pp. 254–261, IEEE Computer Society Press, 1993. ISBN 0-8186-3392-1.
[Mon93c] Monro, D. M. A hybrid fractal transform. In Proceedings of ICASSP-1993 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 169–172, 1993. ISSN 0013-5194.
[Mon94a] Monro, D. M., Woolley, S. J. Fractal image compression without searching. In Proceedings of ICASSP-1994 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 557–560, Adelaide, 1994.
[Mon94b] Monro, D. M., Woolley, S. J. Rate/distortion in fractal compression: order of transform and block symmetries. In Proceedings of ISSIPNN'94 International Symposium on Speech, Image Processing and Neural Networks, vol. 1, pp. 168–171, Hong Kong, 1994.
[Mon95] Monro, D. M., Dudbridge, F. Rendering algorithms for deterministic fractals. IEEE Computer Graphics and Applications, vol. 15(1), pp. 32–41, 1995. ISSN 0272-1716.
[Mul03] Mulopulos, G. P., Hernandez, A. A., Gasztonyi, L. S. Peak signal to noise ratio performance comparison of JPEG and JPEG 2000 for various medical image modalities. In 20th Symposium on Computer Applications in Radiology, Boston, Massachusetts, 2003.
[Nov93] Novak, M. Attractor coding of images. In Proceedings of the International Picture Coding Symposium PCS'93, Lausanne, 1993.
[Och04] Ochotta, T., Saupe, D. Edge-based partition coding for fractal image compression. The Arabian Journal for Science and Engineering, Special Issue on Fractal and Wavelet Methods, vol. 29(2C), 2004.
[Oh03] Oh, T. H., Besar, R. JPEG2000 and JPEG: image quality measures of compressed medical images. In 4th National Conference on Telecommunication Technology (NCTT2003), pp. 31–35, Shah Alam, Malaysia, 2003.
[Øie91] Øien, G. E., Lepsøy, S., Ramstad, T. A. An inner product space approach to image coding by contractive transformations. In Proceedings of ICASSP-1991 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2773–2776, 1991.
[Øie92] Øien, G. E., Lepsøy, S., Ramstad, T. Reducing the complexity of a fractal-based image coder. In Proceedings of the Vth European Signal Processing Conference EUSIPCO'92, pp. 1353–1356, 1992.
[Øie93] Øien, G. E. L2-optimal attractor image coding with fast decoder convergence. Doctor of Philosophy Dissertation, The Norwegian Institute of Technology, 1993.
[Øie94a] Øien, G. E. Parameter quantization in fractal image coding. In Proceedings of ICIP-94 IEEE International Conference on Image Processing, vol. III, pp. 142–146, Austin, Texas, 1994.
[Øie94b] Øien, G. E., Baharav, Z., Lepsøy, S., Karnin, E., Malah, D. A new improved collage theorem with applications to multiresolution fractal image coding. In Proceedings of ICASSP-94 (IEEE International Conference on Acoustics, Speech and Signal Processing), vol. 5, pp. 565–568, Adelaide, Australia, 1994.
[Pin04] Pinhas, A., Greenspan, H. A continuous and probabilistic framework for medical image representation and categorization. Proceedings of SPIE Medical Imaging, vol. 5371, pp. 230–238, 2004.
[Pol01] Polidori, E., Dugelay, J.-L. Zooming using iterated function systems. Fractals, vol. 5 (Supplementary Issue 1), pp. 111–123, 2001.
[Pop93] Popescu, D. C., Yan, H. MR image compression using iterated function systems. Magnetic Resonance Imaging, vol. 11, pp. 727–732, 1993.
[Prz02] Przelaskowski, A. Kompresja danych (Data compression). Internet textbook, http://www.ire.pw.edu.pl/~arturp/Dydaktyka/koda/skrypt.html, 2002.
[Ram76] Ramapriyan, H. K. A multilevel approach to sequential detection of pictorial features. IEEE Transactions on Computers, vol. 25(1), pp. 66–78, 1976.
[Ram86] Ramamurthi, B., Gersho, A. Classified vector quantization of images. IEEE Transactions on Communications, vol. 34, pp. 1105–1115, 1986.
[Reu94] Reusens, E. Partitioning complexity issue for iterated function systems based image coding. In Proceedings of the VIIth European Signal Processing Conference EUSIPCO'94, vol. I, pp. 171–174, Edinburgh, 1994.
[Ros84] Rosenfeld, A. (ed.). Multiresolution image processing and analysis. Springer-Verlag, 1984.
[Ruh97] Ruhl, M., Hartenstein, H., Saupe, D. Adaptive partitionings for fractal image compression. In Proceedings of ICIP-97 (IEEE International Conference on Image Processing), vol. II, pp. 310–313, Santa Barbara, CA, USA, 1997.
[Sam90] Samet, H. The design and analysis of spatial data structures. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1990. ISBN 0-201-50255-0.
[Sau95a] Saupe, D. Accelerating fractal image compression by multi-dimensional nearest neighbor search. In Proceedings of the IEEE Data Compression Conference DCC'95 (J. A. Storer, M. Cohn, eds.), pp. 222–231, Snowbird, UT, USA, 1995.
[Sau95b] Saupe, D. Fractal image compression via nearest neighbor search. In NATO ASI on Fractal Image Encoding and Analysis, Trondheim, Norway, 1995.
[Sau95c] Saupe, D. From classification to multi-dimensional keys. In Fisher [Fis95a], pp. 302–305.
[Sau96a] Saupe, D. The futility of square isometries in fractal image compression. In Proceedings of ICIP-96 IEEE International Conference on Image Processing, vol. I, pp. 161–164, Lausanne, 1996.
[Sau96b] Saupe, D. Lean domain pools for fractal image compression. In Proceedings of IS&T/SPIE (R. L. Stevenson, A. I. Drukarev, T. R. Gardos, eds.), vol. 2669: 1996 Symposium on Electronic Imaging: Science & Technology – Still Image Compression II, pp. 150–157, San Jose, California, USA, 1996.
[Sau96c] Saupe, D., Hamzaoui, R., Hartenstein, H. Fractal image compression – an introductory overview. In Fractal Models for Image Synthesis, Compression and Analysis, ACM SIGGRAPH'96 Course Notes (D. Saupe, J. Hart, eds.), vol. 27, New Orleans, Louisiana, USA, 1996.
[Sau96d] Saupe, D., Ruhl, M. Evolutionary fractal image compression. In Proceedings of ICIP-96 IEEE International Conference on Image Processing, vol. I, pp. 129–132, Lausanne, Switzerland, 1996.
[Sau98] Saupe, D., Ruhl, M., Hamzaoui, R., Grandi, L., Marini, D. Optimal hierarchical partitions for fractal image compression. In Proceedings of ICIP-98 IEEE International Conference on Image Processing, Chicago, 1998.
[Sig97] Signes, J. Geometrical interpretation of IFS based image coding. Fractals, vol. 5 (Supplementary Issue), pp. 133–143, 1997.
[Sta04] Starosolski, R. Przeglad metod bezstratnej kompresji obrazów medycznych (A review of lossless compression methods for medical images). Studia Informatica, vol. 25(2), pp. 49–66, 2004.
[Ste93] Stevenson, R. Reduction of coding artifacts in transform image coding. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 401–404, 1993.
[Tan96] Tanimoto, M., Ohyama, H., Kimoto, T. A new fractal image coding scheme employing blocks of variable shapes. In Proceedings of ICIP-96 (IEEE International Conference on Image Processing), vol. 1, pp. 137–140, Lausanne, Switzerland, 1996.
[Tat92] Tate, S. R. Lossless compression of region edge maps. Tech. report DUKE-TR-1992-09, Duke University, Durham, NC, USA, 1992.
[Tho95] Thomas, L., Deravi, F. Region-based fractal image compression using heuristic search. IEEE Transactions on Image Processing, vol. 4(6), pp. 832–838, 1995. ISSN 1057-7149.
[Vin95] Vines, G. Orthogonal basis IFS. In Fisher [Fis95a].
[Wak97] Wakefield, P., Bethel, D., Monro, D. M. Hybrid image compression with implicit fractal terms. In ICASSP '97: Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, p. 2933, IEEE Computer Society, Washington, DC, USA, 1997. ISBN 0-8186-7919-0.
[Wei96] Wein, C. J., Blake, I. F. On the performance of fractal compression with clustering. IEEE Transactions on Image Processing, vol. 5(3), pp. 522–526, 1996.
[Wei98] Weisfield, R. L. Amorphous silicon TFT x-ray image sensors. IEEE IEDM'98 Technical Digest, pp. 21–24, 1998.
[Woh95] Wohlberg, B., de Jager, G. Fast image domain fractal compression by DCT domain block matching. Electronics Letters, vol. 31(11), pp. 869–870, 1995.
[Woh99] Wohlberg, B., de Jager, G. A review of the fractal image coding literature. IEEE Transactions on Image Processing, vol. 8(12), pp. 1716–1729, 1999.
[Woo94] Woolley, S. J., Monro, D. M. Rate-distortion performance of fractal transforms for image compression. Fractals, vol. 2(3), pp. 395–398, 1994. ISSN 0218-348X.
[Woo95] Woolley, S. J., Monro, D. M. Optimum parameters for hybrid fractal image coding. In Proceedings of ICASSP-1995 IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, 1995.
[Wu91] Wu, X., Yao, C. Image coding by adaptive tree-structured segmentation. IEEE Transactions on Information Theory, pp. 73–82, 1991.
[Zak92] Zakhor, A. Iterative procedures for reduction of blocking effects in transform image coding. IEEE Transactions on Circuits and Systems for Video Technology, vol. 2(1), pp. 91–95, 1992.
[Zha98] Zhao, Y., Yuan, B. A new affine transformation: its theory and application to image coding. IEEE Transactions on Circuits and Systems for Video Technology, vol. 8(3), pp. 269–274, 1998.
