Overview of this blog

This week, we had two lessons: one for 2.2 and one for 2.3. This blog includes the hacks that we were assigned to do for 2.2, which was all about data compression (lossy/lossless) and images (grey scaling, resizing, etc.). I plan to use this as one of my study tools as we begin to prepare for the AP Exam on May 8.

Early Seed Award

For the Early Seed Award, we had to generate an image of a smiley face using PIL, since you cannot simply insert an image URL into a Python notebook. We were then asked to show the smiley face appearing on our blog (run locally). Below is an image of exactly that (generated from PIL!)

from PIL import Image
proof = Image.open('../images/proofthatitworks.png')

display(proof)

AP Prep

Lossy Image and Lossless Image

Below are two images. One of them is better suited to lossy compression, while the other is better suited to lossless compression. Without scrolling down, try to guess which one is which!

from PIL import Image
smile = Image.open('../images/smileyface.png')

display(smile)
from PIL import Image
green = Image.open('../images/green-square-16.png')

new_green = green.resize((700,700))
new_green.save('../images/green-square-320.png')



display(new_green)

The image with the CREEPY smiley face is the better candidate for lossy compression. As you can see, the image does not just consist of yellow; it also has black and some parts that are even white. Images with more varied content generally shrink less under lossless compression, so freeing up a lot of space means using lossy compression and accepting a more noticeable reduction in the overall quality of the image.

The image with the green square, on the other hand, is the better candidate for lossless compression. As you can clearly see, the image is simply a green background and nothing else. A lossless algorithm can describe that much repetition very compactly and still reconstruct every pixel exactly, so the file gets much smaller with no loss in quality.
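One way to check this intuition is to compress raw pixel bytes with `zlib`, which implements DEFLATE, the same lossless algorithm the PNG format uses internally. This is only a sketch: the 64x64 size, the green value, and the seeded random bytes standing in for a "busy" image are all made up for illustration, not taken from the images above.

```python
import random
import zlib

WIDTH, HEIGHT = 64, 64  # hypothetical image size

# A flat image: every pixel is the same green (R, G, B) value.
flat_pixels = bytes([0, 200, 0]) * (WIDTH * HEIGHT)

# A stand-in for a very detailed image: pseudo-random bytes (seeded so runs repeat).
rng = random.Random(0)
busy_pixels = bytes(rng.getrandbits(8) for _ in range(WIDTH * HEIGHT * 3))

flat_packed = zlib.compress(flat_pixels)
busy_packed = zlib.compress(busy_pixels)

print("raw size (bytes):       ", len(flat_pixels))
print("flat image, compressed: ", len(flat_packed))
print("busy image, compressed: ", len(busy_packed))

# Lossless means decompressing returns exactly the original bytes.
flat_restored = zlib.decompress(flat_packed)
busy_restored = zlib.decompress(busy_packed)
print("flat restored exactly:", flat_restored == flat_pixels)
print("busy restored exactly:", busy_restored == busy_pixels)
```

Both inputs come back byte for byte, but only the flat one gets much smaller. That is why a detailed image usually needs lossy compression to shrink by a large amount.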

2.2 College Board Practice Problems

The College Board practice problems were the quiz questions for 2.2 available on College Board. The questions are shown below, along with an image of my final score:

(1) Which of the following is an advantage of a lossless compression algorithm over a lossy compression algorithm?

Possible Answers:

(A) A lossless compression algorithm can guarantee that compressed information is kept secure, while a lossy compression algorithm cannot.

(B) A lossless compression algorithm can guarantee reconstruction of original data, while a lossy compression algorithm cannot.

(C) A lossless compression algorithm typically allows for faster transmission speeds than does a lossy compression algorithm.

(D) A lossless compression algorithm typically provides a greater reduction in the number of bits stored or transmitted than does a lossy compression algorithm.

Correct answer: B

(2) A user wants to save a data file on an online storage site. The user wants to reduce the size of the file, if possible, and wants to be able to completely restore the file to its original version. Which of the following actions best supports the user’s needs?

Possible Answers:

(A) Compressing the file using a lossless compression algorithm before uploading it

(B) Compressing the file using a lossy compression algorithm before uploading it

(C) Compressing the file using both lossy and lossless compression algorithms before uploading it

(D) Uploading the original file without using any compression algorithm

Correct answer: A

(3) A programmer is developing software for a social media platform. The programmer is planning to use compression when users send attachments to other users. Which of the following is a true statement about the use of compression?

Possible Answers:

(A) Lossless compression of video files will generally save more space than lossy compression of video files.

(B) Lossless compression of an image file will generally result in a file that is equal in size to the original file.

(C) Lossy compression of an image file generally provides a greater reduction in transmission time than lossless compression does.

(D) Sound clips compressed with lossy compression for storage on the platform can be restored to their original quality when they are played.

Correct answer: C

from PIL import Image
print("SCORE: ")
score = Image.open('../images/datacompressionscore.png')

display(score)
SCORE: 

As shown in the above image, I earned a 3/3 (100%) on the Data Compression quiz, which suggests that I have a good understanding of what we learned this week. I had no trouble answering any of the questions, and none of them forced me to think for long before arriving at the correct answer. While it is great that I earned a good score on this mini quiz, I should still refer back to it, along with the other tests and quizzes we have taken on College Board, since all of them will serve as useful study tools for the AP exam. I also need to keep practicing questions like the ones on this quiz, as one quiz is not always enough to show that I am strong in this topic. Overall, I am very happy with my score and believe this quiz will help me study for the upcoming AP exam.

2.2 Notes/Observations/Answers to Any Questions

base64 questions

  • How is Base64 similar or different to Binary and Hexadecimal?

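In Base64, there are 64 possible characters, while binary and hexadecimal use only 2 and 16 characters, respectively. All three are ways of writing the same underlying bits; they differ in how many bits each character carries (1 for binary, 4 for hexadecimal, 6 for Base64).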
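The relationship between alphabet size and bits per character can be worked out with a few lines of Python. This is just a sketch; the dictionary names are my own.

```python
import math

# Number of distinct characters in each notation.
alphabet_sizes = {"binary": 2, "hexadecimal": 16, "Base64": 64}

# Each character carries log2(alphabet size) bits.
bits_per_char = {name: int(math.log2(size)) for name, size in alphabet_sizes.items()}
print(bits_per_char)

# Characters needed to write 3 bytes (24 bits) in each notation.
chars_for_24_bits = {name: 24 // bits for name, bits in bits_per_char.items()}
print(chars_for_24_bits)

# The same letter written in binary and in hexadecimal.
code = ord("E")
print(format(code, "08b"), format(code, "02X"))
```

Three bytes take 24 binary digits, 6 hexadecimal digits, or just 4 Base64 characters.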

  • Translate first 3 letters of your name to Base64.

My name: Emaad. First three letters: Ema

ASCII bytes in hexadecimal: 45 6D 61 (E, m, a)

Base64: RW1h
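The translation of "Ema" can be checked step by step: write the three ASCII bytes as 24 bits, split them into four 6-bit groups, and look each group up in the standard Base64 alphabet. The manual steps below are a sketch for illustration; `base64.b64encode` from the standard library is used to confirm the result.

```python
import base64
import string

text = "Ema"
raw = text.encode("ascii")

# Step 1: the bytes in hexadecimal.
hex_bytes = " ".join(f"{b:02X}" for b in raw)
print(hex_bytes)

# Step 2: all 24 bits in a row, split into 6-bit groups.
bits = "".join(f"{b:08b}" for b in raw)
groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]
print(groups)

# Step 3: look up each group in the Base64 alphabet (A-Z, a-z, 0-9, +, /).
alphabet = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
manual = "".join(alphabet[int(group, 2)] for group in groups)

# Cross-check against the standard library.
builtin = base64.b64encode(raw).decode("ascii")
print(manual, builtin)
```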

buffering questions

  • Where have you been a consumer of buffering?

I have been a consumer of buffering when transferring pictures from my camera and uploading them to my Google Photos account. Buffering involves temporarily storing data as it moves from one place to another, so in that sense, the images that I want added to my Google Photos serve as the data being stored.

  • From your consumer experience, what effects have you experienced from buffering?

With buffering, I have been able to retain the original resolution of the desired images while still moving the image data from one place to another.
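As a rough sketch of what buffering looks like in code, the function below moves data from a source to a destination one chunk at a time, with each chunk held temporarily in memory. The function name, the chunk size, and the fake "photo" bytes are all hypothetical; this is not how Google Photos or a camera actually transfers files.

```python
from io import BytesIO


def copy_through_buffer(source, destination, chunk_size=1024):
    """Copy source to destination one chunk at a time; return bytes copied."""
    copied = 0
    while True:
        chunk = source.read(chunk_size)  # the chunk is the temporary buffer
        if not chunk:
            break
        destination.write(chunk)
        copied += len(chunk)
    return copied


# Stand-in for the bytes of a photo file (made-up data).
photo_bytes = bytes(range(256)) * 40

source = BytesIO(photo_bytes)
destination = BytesIO()
copied = copy_through_buffer(source, destination)

print(copied, destination.getvalue() == photo_bytes)
```

Because the buffer only holds the data temporarily and never changes it, the copy that arrives is identical to the original.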

Effects

  • How do these effects apply to images?

In my experience, I have used buffers to change some attributes of an image, such as the size and scale, the file type (png, jpg, heic, etc.), and more. While I do end up with the desired product, whether that was a different size or file type, I have found that it can sometimes reduce the quality of the image slightly and make it look a little "off" from the original. On the surface, it may not look like anything has changed, but when you look closely, you can begin to notice the differences.
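The small quality changes described above can be measured by saving an image into an in-memory buffer in two formats and comparing the reloaded pixels with the original. This sketch assumes Pillow is installed. The 32x32 test image is filled with seeded pseudo-random colours, a made-up worst case rather than a real photo.

```python
import random
from io import BytesIO

from PIL import Image

# Build a small, very detailed test image from seeded pseudo-random bytes.
rng = random.Random(0)
size = (32, 32)
pixels = bytes(rng.getrandbits(8) for _ in range(size[0] * size[1] * 3))
original = Image.frombytes("RGB", size, pixels)


def round_trip(img, fmt):
    """Save img into a memory buffer in the given format, then load it back."""
    buffer = BytesIO()
    img.save(buffer, format=fmt)
    buffer.seek(0)
    reloaded = Image.open(buffer)
    reloaded.load()  # force the pixel data to be read from the buffer
    return reloaded


png_same = round_trip(original, "PNG").tobytes() == original.tobytes()
jpeg_same = round_trip(original, "JPEG").tobytes() == original.tobytes()

print("PNG pixels unchanged: ", png_same)
print("JPEG pixels unchanged:", jpeg_same)
```

PNG is a lossless format, so its pixels come back unchanged. JPEG is lossy, so converting to it is where the slightly "off" look comes from.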

grey scale question

  • Does this code seem like a series of steps are being performed?

Yes, this code definitely acts as a series of steps being performed, as there are several functions and procedures that go into making the image tinted grey.

  • Describe scale image? What is before and after on pixels in three images?

The scale for each image changes depending on what it was initially. For example, the green square image was originally smaller, but using the scale function, it was made much larger. With the second image, there was no visible change (it may have been made slightly larger). The third image (the largest), however, was shrunk down a lot.

  • Describe Grey Scale algorithm in English or Pseudo code?

(1) Import the necessary libraries, such as PIL, base64, and numpy.

(2) Create a list of images that you would like to make grey (or whatever the desired color is)

(3) Create functions that will resize/rescale the image however you want it (shrink it, enlarge it, keep it the same, etc.)

(4) Convert the images into base64

(5) Create the function that will play the main role of tinting the images grey

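(6) For every pixel in the image, average the red, green, and blue values and use that average for all three channels; if the pixel list has more than 3 values (as in a PNG with transparency), the fourth (alpha) value is kept unchanged when the grey data is appended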

(7) Print both the original image and the grey image with the different size, scale, etc. so that the user can see what changed between the two
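The core of the grey scale step can be sketched as a small function that works on one pixel tuple at a time. The function name `to_grey` and the sample pixels are made up for illustration; the averaging and the alpha check follow the same idea as the class code in this post.

```python
def to_grey(pixel):
    """Average R, G, B and reuse the average for all three channels."""
    average = (pixel[0] + pixel[1] + pixel[2]) // 3  # integer division
    if len(pixel) > 3:
        # RGBA pixel: keep the alpha (transparency) value unchanged.
        return (average, average, average, pixel[3])
    return (average, average, average)


# Hypothetical sample pixels: one RGBA and two RGB.
pixels = [(0, 128, 0, 255), (255, 255, 0), (10, 20, 30)]
grey = [to_grey(p) for p in pixels]
print(grey)
```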

  • Is scale image a type of compression? If so, line it up with College Board terms described?

Yes, scaling an image down can be considered a type of compression, but it lines up with lossy compression rather than lossless compression. Shrinking the image reduces the number of pixels that are stored, and the discarded pixel data cannot be reconstructed exactly if the image is later scaled back up. Lossless compression, by contrast, would keep every pixel at the original resolution.
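Whether scaling throws data away can be tested directly: shrink an image, enlarge it back to the original size, and compare. This sketch assumes Pillow is installed. The 4x4 checkerboard and the choice of nearest-neighbour resampling are assumptions made to keep the result easy to reason about.

```python
from PIL import Image

SIZE = 4

# A 4x4 checkerboard in greyscale ("L" mode): alternating black and white pixels.
values = bytes(255 if (x + y) % 2 else 0 for y in range(SIZE) for x in range(SIZE))
checker = Image.frombytes("L", (SIZE, SIZE), values)

# Scale down to 2x2, then back up to 4x4.
small = checker.resize((2, 2), Image.NEAREST)
restored = small.resize((SIZE, SIZE), Image.NEAREST)

pixels_before = checker.size[0] * checker.size[1]
pixels_small = small.size[0] * small.size[1]
same = restored.tobytes() == checker.tobytes()

print("pixels stored:", pixels_before, "->", pixels_small)
print("restored matches original:", same)
```

The smaller image stores a quarter of the pixels, and scaling it back up does not bring the checker pattern back, which is the behaviour College Board describes as lossy.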

2.2 Programming Paradigm (Red/Green/Blue Scaling Implementation, research, etc.)

Lossy and Lossless Image Compression Research

While the 2.2 Blog briefly defines lossy and lossless compression, it is important to acknowledge that both kinds of compression have their advantages and disadvantages. It is also important to understand some specific examples of each, so that if we get a question regarding compression, we know which kind would work better and why.

Lossy Compression: Notes + Advantages + Examples

  • Lossy compression algorithms reduce the file size by removing some of the original data
    • Removes data that is generally considered to be "less important"
  • Used when storage space needs to be "freed up"
  • Advantages + Disadvantages of Lossy Compression
    • Advantages
      • Smaller file size takes up less space, which allows more storage to be left over
      • Faster transmission time
    • Disadvantages
      • Quality of the image/file diminishes due to some of the data being removed
      • Original file's data cannot be reconstructed
  • When to use lossy data compression
    • Updating your website: using lossy would be best here, as a smaller file size will produce faster load times
    • Freeing up space: If you have used up almost all of your space and need to free up some, lossy works best if you don't care about the quality of the images
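The idea of removing "less important" data can be sketched with a toy example: keep only the top 4 bits of each 8-bit value. The function names and sample values are made up, and real lossy formats such as JPEG are far more sophisticated, but the trade-off is the same: smaller codes, and an original that can only be approximated afterwards.

```python
def quantise(samples, step=16):
    """Lossy step: map each 0-255 value to a coarser code (0-15 when step is 16)."""
    return [s // step for s in samples]


def restore(codes, step=16):
    """Best-effort reconstruction: use the middle of each code's range."""
    return [c * step + step // 2 for c in codes]


original = [0, 7, 18, 100, 101, 255]  # hypothetical 8-bit brightness values
codes = quantise(original)            # each code now fits in 4 bits instead of 8
approx = restore(codes)

max_error = max(abs(a - b) for a, b in zip(original, approx))
print(codes)
print(approx)
print("largest error:", max_error)
```

Notice that 100 and 101 end up with the same code, so there is no way to tell them apart after compression.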

Lossless Compression: Notes + Advantages + Examples

  • Lossless data compression reduces file size by storing repeated or redundant data more efficiently (and sometimes by removing unnecessary metadata), so the original file's data can still be fully restored
    • Allows the image to take up less space with no impact on the quality of the image
  • No data is lost
  • Advantages + Disadvantages of Lossless Compression
    • Advantages
      • Smaller file size while still preserving the original quality
    • Disadvantages
      • While lossless data compression reduces the file size, it usually does so less than lossy compression, meaning that the file takes up more space
        • This is because all of the original data is kept
  • When to use lossless data compression
    • When you want to change the image file type: if you want to convert, say, a jpg image to png, lossless data compression would work best to maximize the quality of the image
    • Including a large image: if you want to insert an image that is large in size while still maintaining its original quality, lossless data compression would definitely work best
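Run-length encoding is one simple lossless technique that shows how a file can shrink while "no data is lost". It is used here only as an illustration (it is not claimed to be what PNG or any specific format does), and the function names and the sample row of pixels are made up.

```python
def rle_encode(data):
    """Collapse runs of repeated values into (value, count) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# One hypothetical row of pixels: G = green, Y = yellow.
row = ["G"] * 12 + ["Y"] * 3 + ["G"] * 5
encoded = rle_encode(row)
decoded = rle_decode(encoded)

print(encoded)
print("restored exactly:", decoded == row)
```

Twenty pixel values are stored as three pairs, and decoding gives back the exact original row.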

Red/Green/Blue Scaling with Numpy

from IPython.display import HTML, display
from pathlib import Path  # https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f
from PIL import Image as pilImage # as pilImage is used to avoid conflicts
from io import BytesIO
import base64
import numpy as np


class Image_Data:

    def __init__(self, source, label, file, path, baseWidth=320):
        self._source = source    # variables with self prefix become part of the object, 
        self._label = label
        self._file = file
        self._filename = path / file  # file with path
        self._baseWidth = baseWidth

        # Open image and scale to needs
        self._img = pilImage.open(self._filename)
        self._format = self._img.format
        self._mode = self._img.mode
        self._originalSize = self._img.size
        self.scale_image()
        self._html = self.image_to_html(self._img)
        self._html_grey = self.image_to_html_grey()


    @property
    def source(self):
        return self._source  
    
    @property
    def label(self):
        return self._label 
    
    @property
    def file(self):
        return self._file   
    
    @property
    def filename(self):
        return self._filename   
    
    @property
    def img(self):
        return self._img
             
    @property
    def format(self):
        return self._format
    
    @property
    def mode(self):
        return self._mode
    
    @property
    def originalSize(self):
        return self._originalSize
    
    @property
    def size(self):
        return self._img.size
    
    @property
    def html(self):
        return self._html
    
    @property
    def html_grey(self):
        return self._html_grey
        
    # Large image scaled to baseWidth of 320
    def scale_image(self):
        scalePercent = (self._baseWidth/float(self._img.size[0]))
        scaleHeight = int((float(self._img.size[1])*float(scalePercent)))
        scale = (self._baseWidth, scaleHeight)
        self._img = self._img.resize(scale)
    
    # PIL image converted to base64
    def image_to_html(self, img):
        with BytesIO() as buffer:
            img.save(buffer, self._format)
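            # use the image's own format (png, jpeg, ...) in the data URI instead of always claiming png
            return '<img src="data:image/%s;base64,%s">' % (self._format.lower(), base64.b64encode(buffer.getvalue()).decode())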
            
    # Create Grey Scale Base64 representation of Image
    def image_to_html_grey(self):
        img_grey = self._img.copy()  # copy so the scaled colour image is not overwritten
        numpy = np.array(self._img.getdata()) # PIL image to numpy array
        
        grey_data = [] # list of pixels converted to gray scale
        # 'numpy' is an array of RGB(A) pixels; the array is traversed and each pixel is averaged
        for pixel in numpy:
            # create gray scale of image, ref: https://www.geeksforgeeks.org/convert-a-numpy-array-to-an-image/
            average = (pixel[0] + pixel[1] + pixel[2]) // 3  # average pixel values and use // for integer division
            if len(pixel) > 3:
                grey_data.append((average, average, average, pixel[3])) # PNG format
            else:
                grey_data.append((average, average, average))
            # end for loop for pixels
            
        img_grey.putdata(grey_data)
        return self.image_to_html(img_grey)
    
    def image_to_html_red(self):
        img_red = self._img.copy()  # create a copy of the image
        numpy = np.array(self._img.getdata())

        # keep the red channel; set green and blue channels to 0
        red_data = [] # list of pixels with only the red channel kept
        # 'numpy' is an array of RGB(A) pixels; the array is traversed one pixel at a time
        for pixel in numpy:
            # ref: https://www.geeksforgeeks.org/convert-a-numpy-array-to-an-image/

            if len(pixel) > 3:
                red_data.append((pixel[0], 0, 0, pixel[3])) # PNG format
            else:
                red_data.append((pixel[0],0,0))
        
        # create a new image with the modified numpy array
        img_red.putdata(red_data)
        return self.image_to_html(img_red)

    def image_to_html_green(self):
        img_green = self._img.copy()  # create a copy of the image
        numpy = np.array(self._img.getdata())

        # keep the green channel; set red and blue channels to 0
        green_data = [] # list of pixels with only the green channel kept
        # 'numpy' is an array of RGB(A) pixels; the array is traversed one pixel at a time
        for pixel in numpy:
            # ref: https://www.geeksforgeeks.org/convert-a-numpy-array-to-an-image/

            if len(pixel) > 3:
                green_data.append((0, pixel[1], 0, pixel[3])) # PNG format
            else:
                green_data.append((0, pixel[1], 0))
        
        # create a new image with the modified numpy array
        img_green.putdata(green_data)
        return self.image_to_html(img_green)

    def image_to_html_blue(self):
        img_blue = self._img.copy()  # create a copy of the image
        numpy = np.array(self._img.getdata())

        # keep the blue channel; set red and green channels to 0
        blue_data = [] # list of pixels with only the blue channel kept
        # 'numpy' is an array of RGB(A) pixels; the array is traversed one pixel at a time
        for pixel in numpy:
            # ref: https://www.geeksforgeeks.org/convert-a-numpy-array-to-an-image/

            if len(pixel) > 3:
                blue_data.append((0, 0, pixel[2], pixel[3])) # PNG format
            else:
                blue_data.append((0, 0, pixel[2]))
        
        # create a new image with the modified numpy array
        img_blue.putdata(blue_data)
        return self.image_to_html(img_blue)

        
# prepares a series of images, provides expectation for required contents
def image_data(path=Path("images/"), images=None):  # path of static images is defaulted
    if images is None:  # default image
        images = [
            {'source': "Internet", 'label': "Green Square", 'file': "green-square-16.png"},
            {'source': "Peter Carolin", 'label': "Clouds Impression", 'file': "clouds-impression.png"},
            {'source': "Peter Carolin", 'label': "Lassen Volcano", 'file': "lassen-volcano.jpg"},
            {'source': "Peter Carolin", 'label': "Lassen Volcano", 'file': "smileyface.png"}
        ]
    return path, images

# turns data into objects
def image_objects():        
    id_Objects = []
    path, images = image_data()
    for image in images:
        id_Objects.append(Image_Data(source=image['source'], 
                                  label=image['label'],
                                  file=image['file'],
                                  path=path,
                                  ))
    return id_Objects

# Jupyter Notebook Visualization of Images
if __name__ == "__main__":
    for ido in image_objects(): # ido is an Imaged Data Object
        
        print("---- meta data -----")
        print(ido.label)
        print(ido.source)
        print(ido.file)
        print(ido.format)
        print(ido.mode)
        print("Original size: ", ido.originalSize)
        print("Scaled size: ", ido.size)
        
        print("-- scaled image --")
        display(HTML(ido.html))
        
        print("--- grey image ---")
        display(HTML(ido.html_grey))

        print("--- red image ---")
        display(HTML(ido.image_to_html_red()))

        print("--- green image ---")
        display(HTML(ido.image_to_html_green()))

        print("--- blue image ---")
        display(HTML(ido.image_to_html_blue()))

        
    print()
---- meta data -----
Green Square
Internet
green-square-16.png
PNG
RGBA
Original size:  (16, 16)
Scaled size:  (320, 320)
-- scaled image --
--- grey image ---
--- red image ---
--- green image ---
--- blue image ---
---- meta data -----
Clouds Impression
Peter Carolin
clouds-impression.png
PNG
RGBA
Original size:  (320, 234)
Scaled size:  (320, 234)
-- scaled image --
--- grey image ---
--- red image ---
--- green image ---
--- blue image ---
---- meta data -----
Lassen Volcano
Peter Carolin
lassen-volcano.jpg
JPEG
RGB
Original size:  (2792, 2094)
Scaled size:  (320, 240)
-- scaled image --
--- grey image ---
--- red image ---
--- green image ---
--- blue image ---
---- meta data -----
Lassen Volcano
Peter Carolin
smileyface.png
PNG
RGBA
Original size:  (639, 517)
Scaled size:  (320, 258)
-- scaled image --
--- grey image ---
--- red image ---
--- green image ---
--- blue image ---
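The per-pixel loops in the class above could also be written with NumPy array operations, which handle every pixel at once. This is only a sketch: it assumes Pillow and NumPy are installed and that the image is in RGB or RGBA mode. `isolate_channel` and `to_grey` are hypothetical helper names, not part of the class.

```python
import numpy as np
from PIL import Image


def isolate_channel(img, channel):
    """Keep one colour channel (0=red, 1=green, 2=blue); zero the other two."""
    data = np.array(img)  # shape (height, width, bands); np.array makes a copy
    for index in range(3):
        if index != channel:
            data[..., index] = 0  # alpha (index 3), if present, is left alone
    return Image.fromarray(data)


def to_grey(img):
    """Replace R, G, B with their integer average; alpha is left alone."""
    data = np.array(img)
    # Widen to 16 bits first so adding three 8-bit values cannot overflow.
    average = data[..., :3].astype(np.uint16).sum(axis=-1) // 3
    data[..., :3] = average[..., None].astype(np.uint8)
    return Image.fromarray(data)


# Tiny made-up RGBA image where every pixel is (10, 20, 30, 255).
sample = Image.new("RGBA", (2, 2), (10, 20, 30, 255))
red_pixel = isolate_channel(sample, 0).getpixel((0, 0))
grey_pixel = to_grey(sample).getpixel((0, 0))
print(red_pixel, grey_pixel)
```

Both helpers return a new image and leave the one passed in unchanged, so the red, green, blue, and grey versions can all be made from the same scaled image.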