Use the principles of data visualization introduced in
week #2 to complete this week's assignment. The elements you are expected to consider are:
a. Marks and channels: point shapes, size, colors, etc.
b. Legends and annotations (text and images): font size and face, colors, positions, etc.
c. Title and tick mark labels: font size and face, size, colors
Create a subset of the penguin data set (the one used in the last assignment) that satisfies the following conditions:
a. Delete all records with at least one missing value.
b. Include Adelie penguins and Gentoo penguins from the Biscoe and Torgersen islands.
c. Include only penguins with body_mass_g less than 5000 grams but more than 3500 grams.
d. Rescale body_mass_g by dividing it by 4000 and name the new variable BMI, that is, BMI = body_mass_g / 4000.
e. Exclude the variables X (observation ID), sex, year, and body_mass_g from the above subset.
f. Using the resulting data set:
1). Create a scatter plot of bill_length_mm and flipper_length_mm.
2). Use different colors to indicate the species of penguins.
3). Make sure the point size is proportional to the body mass index (BMI).
4). (Optional) Place a regression line for each species of penguin, and make sure the color of each species-specific regression line is identical to the color of that species' points.
5). Write a paragraph to compare the relationship between the two variables across the species.
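
The steps above could be sketched in Python roughly as follows. This is only one possible approach, not a required solution. It assumes the data are already loaded into a pandas data frame whose columns are named as in the list above (X, species, island, bill_length_mm, flipper_length_mm, body_mass_g, sex, year) and that species and island are stored as plain strings such as "Adelie" and "Biscoe". The function names make_subset and plot_subset, the color choices, the size_scale parameter, and the choice of bill length for the horizontal axis are all illustrative assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def make_subset(penguins: pd.DataFrame) -> pd.DataFrame:
    """Apply conditions a-e to a penguins data frame (hypothetical helper)."""
    df = penguins.dropna()  # a. drop records with any missing value
    keep = (
        df["species"].isin(["Adelie", "Gentoo"])        # b. species
        & df["island"].isin(["Biscoe", "Torgersen"])    # b. islands
        & (df["body_mass_g"] > 3500)                    # c. strictly above 3500 g
        & (df["body_mass_g"] < 5000)                    # c. strictly below 5000 g
    )
    df = df.loc[keep].copy()
    df["BMI"] = df["body_mass_g"] / 4000                # d. rescaled body mass
    # e. drop unwanted columns; errors="ignore" covers files without an X column
    return df.drop(columns=["X", "sex", "year", "body_mass_g"], errors="ignore")


def plot_subset(df: pd.DataFrame, size_scale: float = 60.0):
    """Scatter plot for task f, with one fitted line per species."""
    colors = {"Adelie": "tab:orange", "Gentoo": "tab:blue"}  # assumed palette
    fig, ax = plt.subplots(figsize=(7, 5))
    for species, grp in df.groupby("species"):
        color = colors.get(species, "gray")
        x = grp["bill_length_mm"].to_numpy(dtype=float)
        y = grp["flipper_length_mm"].to_numpy(dtype=float)
        # Marker area (the s argument) is set proportional to BMI.
        ax.scatter(x, y, s=size_scale * grp["BMI"], color=color,
                   alpha=0.7, label=species)
        if len(grp) >= 2:
            # Least-squares straight line, drawn in the species color.
            slope, intercept = np.polyfit(x, y, 1)
            xs = np.array([x.min(), x.max()])
            ax.plot(xs, slope * xs + intercept, color=color)
    ax.set_xlabel("Bill length (mm)", fontsize=12)
    ax.set_ylabel("Flipper length (mm)", fontsize=12)
    ax.set_title("Flipper length vs. bill length by species", fontsize=14)
    ax.legend(title="Species")
    return ax
```

A typical call would read the file with pd.read_csv (the file name depends on where the data were saved for the last assignment), pass the result through make_subset and then plot_subset, and finish with plt.show(). Because BMI can only range from 0.875 to 1.25 after filtering, the size differences between points are modest; a larger size_scale does not change the ratio between marker areas, only their overall size.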