This note introduces software programs and platforms that
could be used in this course.
R is a language and environment for statistical computing and
graphics. It is a GNU project which is similar to the S language and
environment which was developed at Bell Laboratories (formerly AT&T,
now Lucent Technologies) by John Chambers and colleagues. R can be
considered as a different implementation of S. There are some important
differences, but much code written for S runs unaltered under R.
R is an integrated suite of software facilities for data manipulation, calculation, and graphical display.
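As a small illustration of these three facilities, the sketch below uses only base R and the built-in `mtcars` data set. The variable names are chosen for this example only.

```r
# Data manipulation: keep the 4-cylinder cars and two columns of mtcars
four_cyl <- subset(mtcars, cyl == 4, select = c(mpg, wt))

# Calculation: average miles per gallon for each cylinder count
mean_mpg <- tapply(mtcars$mpg, mtcars$cyl, mean)
print(round(mean_mpg, 2))

# Graphical display: scatter plot of fuel efficiency against weight
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel efficiency versus weight")
```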
RStudio is an integrated development environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging, and workspace management.
There are two versions of RStudio: RStudio Desktop and RStudio Server. Both versions have free open-source and commercial editions. We use the free open-source edition of RStudio Desktop.
R and RStudio are two distinct applications that serve different purposes. R is a programming language for statistical computing, while RStudio is an IDE in which R code is written and run.
R and RStudio are not separate versions of the same program and cannot be substituted for one another. R can be used without RStudio, but RStudio cannot be used without R.
First of all, you need to sign up for an account with RPubs if you
don’t have one. Otherwise, sign in to your existing RPubs account. The
following two hyperlink buttons will bring you to the appropriate
website.
You need to install a recent version of R itself, RStudio, and the knitr package on your machine.
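One way to confirm that the requirements are in place is to check from the R console. This is a minimal sketch using base R functions only; it reports what is installed and does not install anything itself.

```r
# Check whether knitr is installed without attaching it
has_knitr <- requireNamespace("knitr", quietly = TRUE)

if (has_knitr) {
  message("knitr ", as.character(utils::packageVersion("knitr")), " is installed.")
} else {
  message("knitr is missing; run install.packages(\"knitr\") to install it from CRAN.")
}

# The version of R itself
print(R.version.string)
```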
In RStudio, create a new R Markdown document by choosing File | New | R Markdown.
Click the Knit HTML button in the doc toolbar to preview your document.
In the preview window, click the Publish button.
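To show what such a document contains, the sketch below writes a very small R Markdown file from R: a YAML header, a line of narrative text, and one R code chunk. The title and file name are hypothetical, and the commented-out `rmarkdown::render()` call is only roughly what the Knit button does; it assumes the rmarkdown package and Pandoc are available.

```r
# Build the three-backtick chunk fence programmatically
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  "title: \"My First Report\"",   # hypothetical title
  "output: html_document",
  "---",
  "",
  "Some narrative text written in Markdown.",
  "",
  paste0(fence, "{r}"),           # start of an R code chunk
  "summary(cars)",
  fence                           # end of the chunk
)

# Write the document to a temporary file (the path is arbitrary)
rmd_path <- file.path(tempdir(), "first-report.Rmd")
writeLines(rmd_lines, rmd_path)

# Knitting from the console instead of the toolbar button
# (left commented out because it needs rmarkdown and Pandoc):
# rmarkdown::render(rmd_path)
```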
GitHub is a code-hosting platform, built on the Git version control system, where programmers share their code. Many companies and organizations use it to facilitate project management and collaboration. It is the most prominent source code host, with over 60 million new repositories.
Most importantly, it is free. We can also use this resource to host web pages. Many images and data sets that I used are stored on GitHub.
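Data files kept in a public GitHub repository can be read into R directly from their raw-content URL. The sketch below only builds such a URL; the account, repository, branch, and file names are hypothetical placeholders, and `github_raw_url()` is a helper defined for this example rather than a function from any package.

```r
# Build the raw-content URL of a file stored in a public GitHub repository
github_raw_url <- function(user, repo, branch, path) {
  paste("https://raw.githubusercontent.com", user, repo, branch, path, sep = "/")
}

# Hypothetical account, repository, branch, and file names
csv_url <- github_raw_url("your-username", "your-repo", "main", "data/example.csv")
print(csv_url)

# With a real URL, the data can be read directly (needs an internet connection):
# dat <- read.csv(csv_url)
```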
You can use the following two buttons to sign up for an account with GitHub or sign in to an existing GitHub account.
We will use screenshots to demonstrate how to create a repository, folders, and files.
Click New repository to create a new repository.
In the repository sta553, we want separate folders for different files. To create a folder under sta553, click the hyperlink `creating a new file`. We first create a data folder, which will be used to store data files. After typing “data/”, a new box appears under the “data” folder; type the first file name, readme, and the content of the file (see the screenshot). Finally, click the green button “Commit new file” to complete the creation of the first folder in the repository, data.
To upload files to the data folder, click the drop-down menu in the top right corner and select `upload files`.
Back in sta553, click Creating New File to create a new folder, image, in the same way.
SAS OnDemand provides free data management and data analysis tools. Its advantage is that it requires no installation: it runs in the cloud over the internet and processes data on the SAS server. In other words, your computer serves mainly as a monitor, since the data processing does not use your computer's memory or CPU.
Click Access to enter the SAS OnDemand login page.
If you have already created your SAS Profile, use the email or user ID and the password to log into the SAS OnDemand page.
If you don’t have a SAS profile, click the link Don't have a SAS profile? and you will see the following pop-up dialogue box. Click Create profile, and a pop-up sign-up page appears. Then follow the directions to create your SAS profile.
After you log in to the OnDemand page with your profile information, you will see the link to the SAS Studio user interface along with your account information.
Once you have created a SAS profile, you will have 5 GB of free storage.
In the Applications tab, click SAS Studio, and you will see the SAS Studio user interface on a separate page (it may take a little time to initialize your account the first time you use it).
The above screenshot was taken from my SAS course webpage. If you learned SAS using the classical SAS, you will find SAS Studio much more convenient and easier to use.
SAS Studio (Academic OnDemand) is installed on SAS servers hosted in the Microsoft Azure Cloud. Although SAS claims that your assigned storage is private and secured, you should avoid uploading sensitive data to your private storage on the SAS server, since SAS does not disclose the level of security of that storage.
The following libraries will be used throughout this class.
ggplot2 is a system for creating charts based on the Grammar of Graphics. It has proved to be one of the most powerful R libraries for visualization.
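The layered idea behind the Grammar of Graphics can be sketched with a short example. It assumes the ggplot2 package is installed and uses the built-in `mtcars` data set; the choice of variables and labels is for illustration only.

```r
library(ggplot2)

# Data (mtcars), aesthetic mappings (aes), and a geometric layer (geom_point)
# are combined with `+`, following the Grammar of Graphics
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 2) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", color = "Cylinders")

print(p)
```

Further layers, such as `geom_smooth()`, can be added to `p` in the same way.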
Plotly is a data visualization platform with an R package (also available for Python). The package creates interactive web-based plots using the plotly.js library. Users can interact with the graphs: change their scale, pick out individual records, and hover over points to see their values. Moreover, Plotly graphs can easily be added to knitr/R Markdown documents or Shiny apps.
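A minimal sketch of an interactive scatter plot follows. It assumes the plotly package is installed and again uses the built-in `mtcars` data; the `text` argument is used here so that the car names appear when hovering over a point.

```r
library(plotly)

# An interactive scatter plot; hovering over a point shows its values
fig <- plot_ly(
  data = mtcars,
  x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers",
  text = rownames(mtcars)   # car names shown in the hover label
)

fig
```

In RStudio the figure opens in the Viewer pane, and in a knitted HTML document it is embedded as an interactive widget.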
Leaflet is a well-known package built on the Leaflet JavaScript library for interactive maps. It is widely used for creating, customizing, and designing interactive maps, and the maps it produces are mobile-friendly.
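A basic interactive map can be sketched as follows, assuming the leaflet package is installed. The coordinates and popup text are arbitrary values chosen for this example.

```r
library(leaflet)

# Arbitrary example coordinates (longitude, latitude); replace with your own
m <- leaflet() %>%
  addTiles() %>%                                   # default OpenStreetMap base layer
  setView(lng = -75.6, lat = 39.95, zoom = 12) %>% # center and zoom level of the map
  addMarkers(lng = -75.6, lat = 39.95, popup = "An example marker")

m
```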
ggmap, map, dygraph,