02_R Programming for Data Science

1. Load code to R
   (1) copy code to console
   (2) put file into working directory, then source("____.R")
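
   A minimal sketch of option (2). The file name hello.R and the variable greeting are made up for illustration, and the script is written to a temporary directory rather than the working directory so nothing is left behind:

```r
# Write a one-line script to a temporary directory (hypothetical file name).
script_path <- file.path(tempdir(), "hello.R")
writeLines('greeting <- "hello from a sourced file"', script_path)

# getwd() shows the current working directory.
getwd()

# source() runs every line of the file; local = TRUE evaluates it
# in the environment that source() is called from.
source(script_path, local = TRUE)
print(greeting)
```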

2. Operators
   (1) <-   : assignment operator
   (2) #    : hash symbol(comment)
   (3) :    : colon operator(integer sequence)
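   (4) + - * / ^ : arithmetic operators
   (5) > < == != : comparison operators
       >= <=
       & | !     : logical operators (element-wise)
       && ||     : logical operators (single values, short-circuit)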
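
   A short sketch of how these operators behave; the variable names are only for illustration:

```r
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

elementwise_and <- x & y   # compared element by element
elementwise_or  <- x | y
negated         <- !x

# && works on single values and stops as soon as the result is known,
# so the right-hand side is never evaluated here.
scalar_and <- FALSE && stop("never evaluated")

int_seq <- 1:5    # colon operator: integer sequence
power   <- 2^3    # arithmetic

print(elementwise_and)
print(scalar_and)
```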
        
3. R Objects
   (1) classes: character, numeric, integer, complex, logical
   (2) vector: class of elements must be same
   (3) list: class of elements can be different
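
   To illustrate the difference between (2) and (3), a small sketch (variable names are illustrative): c() coerces mixed input to a single class, while list() keeps each element as it is.

```r
v <- c(1, "a", TRUE)       # everything is coerced to character
l <- list(1, "a", TRUE)    # each element keeps its own class

v_class   <- class(v)
l_classes <- sapply(l, class)

print(v_class)
print(l_classes)
```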

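4. Sequences of numbers
   (1) a:b : create a sequence of numbers in steps of 1
   (2) seq(from, to, by=/length.out=, along.with=): along.with takes the length from the length of this argument
   (3) seq_along(x): same as seq(along.with=x)
   (4) rep(x, times=/each=): repeat the whole vector (times) or each element (each)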
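
   A quick sketch of the sequence functions listed above, with illustrative variable names:

```r
a <- 1:5                              # 1 2 3 4 5
b <- seq(0, 1, by = 0.25)             # fixed step size
s <- seq(0, 10, length.out = 5)       # fixed number of values
idx <- seq_along(c("x", "y", "z"))    # one index per element
r_times <- rep(c(0, 1), times = 2)    # repeat the whole vector
r_each  <- rep(c(0, 1), each = 2)     # repeat each element

print(b)
print(r_times)
print(r_each)
```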

5. Vector
   (1) c(), vector()

   List
   (1) list()

6. Matrix
   (1) matrix(,nrow=,ncol=)
   (2) dim(x)<-c(4,5): give a vector a `dim` attribute
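
   Both ways of building a matrix, as a small sketch (names are illustrative). Note that R fills a matrix column by column.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # filled column by column

v <- 1:20
dim(v) <- c(4, 5)    # the vector is now a 4 x 5 matrix

print(m)
print(class(v))
```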

7. Dataframe
   (1) data.frame(name=c(),...)
   (2) read.table/read.csv()
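
   A minimal sketch of both approaches. The column names and values are invented for illustration, and read.csv() reads from an inline string (its text argument) instead of a file so the example is self-contained:

```r
# Build a data frame directly from vectors.
df <- data.frame(name = c("a", "b"), score = c(90, 85))

# Read the same data from CSV text; normally the first argument is a file path.
csv <- read.csv(text = "name,score\na,90\nb,85")

print(df)
print(csv)
```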

8. Missing values
   (1) NA: not available
   (2) NaN: not a number
   (3) Inf: infinity (Inf-Inf=NaN)
    A NaN is also NA (is.na(NaN) is TRUE), but the converse is not true (is.nan(NA) is FALSE).
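
   The relationship between the missing-value types can be checked directly. A short sketch with illustrative variable names:

```r
nan_is_na  <- is.na(NaN)        # TRUE
na_is_nan  <- is.nan(NA)        # FALSE
inf_diff   <- Inf - Inf         # NaN
one_by_0   <- 1 / 0             # Inf

x <- c(1, NA, 3)
mean_with_na    <- mean(x)                 # NA propagates
mean_without_na <- mean(x, na.rm = TRUE)   # drop NA before computing

print(mean_with_na)
print(mean_without_na)
```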

9. Subsetting
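   (1) x[index]: index can be a vector of positive or negative integer positions, a logical expression, or a vector of element names
   (2) []: returns an object of the same class as the original (exception: a single row or column of a matrix comes back as a vector unless drop=FALSE)
   (3) [[]]: extracts a single element of a list or data frame; the result may have a different class from the original
   (4) [c()]: returns multiple elements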
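
   A small sketch of the subsetting forms above; the vectors and names are made up for illustration:

```r
x <- c(a = 10, b = 20, c = 30)
by_position <- x[c(1, 3)]   # positive positions
by_negative <- x[-2]        # everything except position 2
by_logical  <- x[x > 15]    # logical expression
by_name     <- x["b"]       # element name

l <- list(num = 1:3, chr = "z")
single_bracket <- l["num"]     # still a list
double_bracket <- l[["num"]]   # the integer vector itself

m <- matrix(1:6, nrow = 2)
row_vec <- m[1, ]                 # dimensions are dropped: a vector
row_mat <- m[1, , drop = FALSE]   # stays a 1 x 3 matrix

print(class(single_bracket))
print(class(double_bracket))
```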

10. basic functions
    (01) sqrt(), abs()
    (02) length(), dim()
    (03) names(), colnames()
    (04) class(), attributes(), args()
    (05) cbind(), rbind()
    (06) nrow(), ncol()
    (07) head(x, n), tail(x, n): first / last n elements or rows
    (08) str(): understand the structure of something
    (09) sum(), identical()
    (10) paste(x, collapse=" "): join the elements of a character vector together into one continuous character string
         paste(x, y, sep=" "): join two character vectors together element by element
    (11) LETTERS: a predefined variable in R containing a character vector of all 26 letters in the English alphabet
    (12) ls(): list all the objects in your local workspace
         rm(): remove objects from your local workspace
    (13) ?command: access R's built-in help files (e.g. ?`:`)
    (14) unique(): returns a vector of only the 'unique' elements
         object.size(): how much space the dataset is occupying in memory
    (15) replicate(): repeat an expression n times; colMeans(): column means; hist(): plot a histogram
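
    The difference between the collapse and sep arguments of paste() is easiest to see side by side. A short sketch with illustrative inputs:

```r
collapsed <- paste(c("My", "name", "is"), collapse = " ")  # one string
joined    <- paste("Hello", "world!", sep = " ")           # two vectors joined
labelled  <- paste(LETTERS[1:3], 1:3, sep = "")            # element by element

distinct <- unique(c(3, 4, 3, 5, 4))

print(collapsed)
print(labelled)
print(distinct)
```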

11. Logical functions
    (1) isTRUE(): TRUE only if the argument is a single TRUE value; xor(): exclusive OR
    (2) which(): takes a logical vector as an argument and returns the indices of the elements that are TRUE
    (3) any(): returns TRUE if one or more of the elements in the logical vector is TRUE
    (4) all(): returns TRUE if every element in the logical vector is TRUE
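
    A quick sketch of these functions on a made-up numeric vector:

```r
ints <- c(4, -1, 7, 0, 12)

big_idx      <- which(ints > 5)   # positions where the condition holds
any_negative <- any(ints < 0)
all_positive <- all(ints > 0)
exclusive    <- xor(TRUE, FALSE)  # TRUE when exactly one side is TRUE
single_true  <- isTRUE(6 > 4)

print(big_idx)
print(any_negative)
print(all_positive)
```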

12. Loop
    (1) for(i in 1:10)
    (2) lapply(list, function): always returns a list
    (3) sapply(list, function): tries to simplify the result of lapply()
        if every element of the result has length one, sapply() returns a vector
        if every element is a vector of the same length (> 1), sapply() returns a matrix
        if sapply() can't figure things out, it just returns a list
    (4) vapply(list, function, FUN.VALUE=numeric(1)): specify the format of the result explicitly
    (5) tapply(vector, factor, function): split your data up into groups based on the value of some variable, then apply a function to the members of each group
    (6) split(vector, factor): split the data into groups defined by a factor
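
    The return types described above can be seen on a small example. This is only a sketch; the data and names are invented:

```r
scores <- list(a = c(1, 2, 3), b = c(4, 5, 6, 7))

as_list   <- lapply(scores, mean)              # list of two numbers
as_vector <- sapply(scores, mean)              # simplified to a named vector
as_matrix <- sapply(scores, range)             # each result has length 2: a matrix
checked   <- vapply(scores, mean, numeric(1))  # errors if a result is not numeric(1)

values <- c(10, 20, 30, 40)
group  <- factor(c("x", "y", "x", "y"))
group_means <- tapply(values, group, mean)     # mean of each group
groups      <- split(values, group)            # list with one element per group

print(as_vector)
print(as_matrix)
print(group_means)
```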

13. Function
    name <- function(a=TRUE){
        result    # variable name; the value of the last expression is returned
    }
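
    A concrete sketch of this template. The function name above and its arguments are hypothetical; it shows a default argument value and the implicit return of the last expression:

```r
# Return the elements of x that are greater than n (n defaults to 10).
above <- function(x, n = 10) {
  x[x > n]   # last expression: this is the return value
}

default_n <- above(c(5, 12, 20))
custom_n  <- above(c(5, 12, 20), n = 15)

print(default_n)
print(custom_n)
```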

14. Scoping rules
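    (1) lexical scoping (used by R): a free variable is looked up in the environment where the function was defined, then in its parent environments up to the top-level (global) environment, then on through the search list until the empty environment
    (2) dynamic scoping: a free variable is looked up in the environment from which the function was called (the calling environment)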
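
    A small sketch of why the distinction matters; f, g and y are illustrative names. Because R uses lexical scoping, the free variable y inside g() is found where g() was defined, not inside f(), which calls it:

```r
y <- 10

f <- function(x) {
  y <- 2          # local to f; invisible to g
  y^2 + g(x)
}

g <- function(x) {
  x * y           # y is a free variable: looked up where g was defined
}

result <- f(3)    # 2^2 + 3 * 10
print(result)
```

    Under dynamic scoping, g() would instead see the y = 2 from its caller and the result would be 2^2 + 3 * 2 = 10.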

15. Simulation
    (1) sample(x, size, replace=, prob=): draw a random sample from x; replace=TRUE allows a value to be drawn again, prob gives the probability of each value
    (2) rbinom(n, size=, prob=): simulate binomial random variables
    (3) rnorm(n, mean=, sd=): generate random numbers from a normal distribution (standard normal by default: mean=0, sd=1)
    (4) rpois(n, lambda=): generate random values from a Poisson distribution
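
    A minimal sketch of these generators. The call to set.seed() is an addition not mentioned in the notes; it only makes the random draws reproducible. All parameter values are arbitrary:

```r
set.seed(42)   # assumption: fix the seed so repeated runs give the same draws

dice   <- sample(1:6, 4, replace = TRUE)                         # four dice rolls
flips  <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7))
heads  <- rbinom(1, size = 100, prob = 0.7)   # number of successes in 100 trials
z      <- rnorm(10)                           # standard normal
scaled <- rnorm(10, mean = 100, sd = 25)
counts <- rpois(5, lambda = 10)

print(dice)
print(heads)
print(counts)
```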

16. Date
    (1) Sys.Date() get the current date 
    (2) as.Date("yyyy-mm-dd")

17. Time
    (1) Sys.time() get the current time
    (2) POSIXct: the time stored as the number of seconds since 1970-01-01 (see as.POSIXct())
        POSIXlt: a list of values that make up the date and time (see as.POSIXlt())

18. Date and time functions
    (1) weekdays(), months(),quarters()
    (2) strptime() converts character vectors to POSIXlt.
    (3) difftime(time1,time2,units='days') difference in times over the units
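
    A short sketch tying the date and time items together. The dates are arbitrary, and tz = "UTC" is set explicitly so the result does not depend on the local time zone:

```r
d <- as.Date("1969-01-01")
days_since_epoch <- unclass(d)   # dates are stored as days since 1970-01-01

t_ct <- as.POSIXct("2024-01-15 10:30:00", tz = "UTC")   # seconds since 1970-01-01
t_lt <- as.POSIXlt(t_ct)                                # list-like: $hour, $min, ...
minute <- t_lt$min

# strptime() parses a character vector with an explicit format into POSIXlt.
parsed <- strptime("2024/01/15 10:30", "%Y/%m/%d %H:%M", tz = "UTC")

gap <- difftime(as.Date("2024-03-01"), as.Date("2024-01-01"), units = "days")
qtr <- quarters(as.Date("2024-05-10"))

print(days_since_epoch)
print(gap)
print(qtr)
```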

19. Debugging
    (1) traceback: print out the function call stack
    (2) debug: flag a function for "debug" mode, you can execute one line at a time
    (3) browser: suspend the execution of a function wherever it is called
    (4) trace: insert debugging code into a function
    (5) recover: modify the error behavior so that you can browse the function call stack

20. Profiling R code
    (1) system.time(): measure how long an expression takes to run
    (2) Rprof(), summaryRprof()
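
    A minimal sketch of timing a piece of code with system.time(); the loop is just an arbitrary workload. The result reports user, system and elapsed time in seconds:

```r
timing <- system.time({
  total <- 0
  for (i in 1:100000) {
    total <- total + sqrt(i)
  }
})

print(timing)
print(timing[["elapsed"]])   # wall-clock seconds
```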

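Copyright notice: this is an original article by lihaosnoopy, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.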