I have twice tried to get comfortable and competent with the open-source statistical software package R, and I have twice failed. It is not that I don’t recognize the many advantages of R over other statistical-analytical approaches (I can think of several people who would immediately start telling me how great R is, and I am trying to forestall that). Rather, it is a combination of interrelated factors: lack of success in the early stages, total unfamiliarity with the structure of R’s input/output language (and help files), and instructions and tutorials that seem targeted at users with completely different priorities from mine.
Priorities: I need to learn to use R for its two primary
purposes – statistical analyses of my data and graphical representation of my
data and my results. Obviously, this overall priority is the same as for most R
users. What I don’t need to prioritize, in my opinion, is what the textbooks,
help files, and most R-related websites put first: a list of commands and
functions in R that are of broad general use. The problem for me is that
I cannot easily think of a situation in which I would need to (for example)
select a set of columns in my dataset by number. Yet such functions are
commonly presented in the opening chapters of any discussion of R. The first
thing I want to do is put my dataset, up to this point in the form of an Excel
worksheet, into R, and then have a look at it to confirm it loaded correctly.
Picking out values in specific places – e.g. the third value in the fifth
variable – is indeed a useful way to do some of that. But not before I’ve loaded my dataset!
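For what it is worth, that load-first, inspect-second order can be sketched in a few lines of R. This is only a minimal illustration under some assumptions: the Excel worksheet has already been saved as a CSV file, and the column names and values below are invented. A temporary file stands in for the real dataset so that the example runs on its own.

```r
# Hypothetical stand-in for a worksheet exported from Excel as CSV;
# the columns and values are made up for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "site,treatment,height,mass,count",
  "A,control,10.2,3.1,4",
  "B,control,11.5,3.4,6",
  "C,fertilized,14.8,4.0,9",
  "D,fertilized,15.1,4.2,7"
), csv_path)

# Step 1: load the dataset into a data frame.
dat <- read.csv(csv_path)

# Step 2: look at it to confirm it loaded correctly.
str(dat)   # variable names and types
head(dat)  # first few rows
dim(dat)   # number of rows and columns

# Step 3: only now does positional indexing become useful,
# e.g. the third value in the fifth variable.
third_of_fifth <- dat[3, 5]
print(third_of_fifth)
```

With a real dataset, `csv_path` would simply be the path to the exported file, and the same three inspection calls apply.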
This is why I am happily reading Getting Started with R: An Introduction for Biologists, by Andrew P. Beckerman and Owen L. Petchey (Amazon link to the Kindle edition; I don't know why it's not finding the paperback I bought). This book, unlike all the others I have met, starts with organizing your
data and getting R up and running. I’ve read the first two chapters (of 6, it’s
a small book) and my confidence is already improved. Getting demoralized by
apparently deeply mysterious errors that just lead to more errors and confusion
is a big part of why I’ve abandoned R in the past.