R for Everyone: A Practical Guide to Mastering the R Language
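Updated 2024-07-18
R for Everyone is a guide written for beginners and R enthusiasts alike. It introduces both the fundamentals and the advanced applications of R in an accessible way. The book covers R's basic syntax, data processing, statistical analysis and graphics, and it emphasizes hands-on practice, using many examples to help readers master practical R techniques.
As an e-book, it uses the popular ePub format, an open, industry-standard format for electronic reading. Support for ePub varies across reading devices and applications, however. Users can adjust settings on their device or app to suit their preferences, such as font, font size, single- or double-column layout, portrait or landscape mode, and click-to-enlarge images. For more information about a device's or app's settings and features, visit the manufacturer's official website.
The book contains many code and configuration examples. To display them well, readers are advised to choose single-column mode and landscape orientation and to set the font size to the smallest setting. In addition to reflowable text, code and configuration listings are provided as images that match the print edition, to keep them legible. Where the reflowable format degrades a code listing, readers will see a "Click here to view code image" link at the appropriate place. Clicking it displays the print-fidelity code image; to return to the previous page, click the Back button.
R for Everyone suits both beginners teaching themselves R and users with some background who want a deeper understanding of the language. Whether for theory or for hands-on projects, it offers solid support, helping readers steadily improve their R skills and explore the world of data science.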
packages has been made incredibly easy with the advent of devtools and Rcpp.
Appendix A, Real-Life Resources: A listing of our favorite resources for learning more about R
and interacting with the community.
Appendix B, Glossary: A glossary of terms used throughout this book.
A good deal of the text in this book is either R code or the results of running code. Code and results
are most often in a separate block of text and set in a distinctive font, as shown in the following
example. The different parts of code also have different colors. Lines of code start with >, and if
code is continued from one line to another the continued line begins with +.
> # this is a comment
>
> # now basic math
> 10 * 10
[1] 100
>
> # calling a function
> sqrt(4)
[1] 2
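The example above does not include a continued line. As a minimal sketch of that convention, the call below is spread over three lines; the variable name total is an illustrative assumption, not something from the book. Typed at the console, the first line would be shown after > and the two following lines after +, because the expression is not complete until the closing parenthesis.

```r
# One expression split across three lines.
# At the console the first line appears after the > prompt and the
# next two after the + continuation prompt.
total <- sum(1,
             2,
             3)

# Print the result
print(total)
```

The prompts are left out here so the lines can be pasted into a script or the console as they are.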
Certain Kindle devices do not display color so the digital edition of this book will be viewed in
greyscale on those devices.
There are occasions where code is shown inline and looks like sqrt(4).
In the few places where math is necessary, the equations are indented from the margin and are
numbered.
Within equations, normal variables appear as italic text (x), vectors are bold lowercase letters (x)
and matrices are bold uppercase letters (X). Greek letters, such as α and β, follow the same
convention.
Function names will be written as join and package names as plyr. Objects generated in code
that are referenced in text are written as object1.
Learning R is a gratifying experience that makes life so much easier for so many tasks. I hope you
enjoy learning with me.
Acknowledgments
To start, I must thank my mother, Gail Lander, for encouraging me to become a math major. Without
that I would never have followed the path that led me to statistics and data science. In a similar vein, I
have to thank my father, Howard Lander, for paying all those tuition bills. He has been a valuable
source of advice and guidance throughout my life and someone I have aspired to emulate in many
ways. While they both insist they do not understand what I do, they love that I do it and have helped
me all along the way. Staying with family, I should thank my sister and brother-in-law, Aimee and
Eric Schechterman, for letting me teach math to Noah, their five-year-old son.
There are many teachers who have helped shape me over the years. The first is Rochelle Lecke,
who tutored me in middle school math even when my teacher told me I did not have worthwhile math
skills.
Then there is Beth Edmondson, my precalc teacher at Princeton Day School. After I wasted the first
half of high school as a mediocre student, she told me I had “some nerve signing up for next year’s AP
Calc” given my grades. She agreed to let me take AP Calc if I went from a C to an A+ in her class,
never thinking I stood a chance. Three months later, she was in shock as I not only earned the A+, but
turned around my entire academic career. She changed my life and without her, I do not know where I
would be today. I am forever grateful that she was my teacher.
For the first two years at Muhlenberg College, I was determined to be a business and
communications major, but took math classes because they came naturally to me. My professors, Dr.
Penny Dunham, Dr. Bill Dunham, and Dr. Linda McGuire, all convinced me to become a math major,
a decision that has greatly shaped my life. Dr. Greg Cicconetti gave me my first glimpse of rigorous
statistics, my first research opportunity and planted the idea in my head that I should go to grad school
for statistics.
While earning my M.A. at Columbia University, I was surrounded by brilliant minds in statistics
and programming. Dr. David Madigan opened my eyes to modern machine learning, and Dr. Bodhi
Sen got me thinking about statistical programming. I had the privilege to do research with Dr. Andrew
Gelman, whose insights have been immeasurably important to me. Dr. Richard Garfield showed me
how to use statistics to help people in disaster and war zones when he sent me on my first assignment
to Myanmar. His advice and friendship over the years have been dear to me. Dr. Jingchen Liu
allowed and encouraged me to write my thesis on New York City pizza, which has brought me an
inordinate amount of attention.[1]
1. http://slice.seriouseats.com/archives/2010/03/the-moneyball-of-pizza-statistician-uses-statistics-to-find-nyc-best-pizza.html
While at Columbia, I also met my good friend—and one time TA—Dr. Ivor Cribben who filled in
so many gaps in my knowledge. Through him, I met Dr. Rachel Schutt, a source of great advice, and
who I am now honored to teach alongside at Columbia.
Grad school might never have happened without the encouragement and support of Shanna Lee. She
helped maintain my sanity while I was incredibly overcommitted to two jobs, classes and Columbia’s
hockey team. I am not sure I would have made it through without her.
Steve Czetty gave me my first job in analytics at Sky IT Group and taught me about databases,
while letting me experiment with off-the-wall programming. This sparked my interest in statistics and
data. Joe DeSiena, Philip du Plessis, and Ed Bobrin at the Bardess Group are some of the finest
people I have ever had the pleasure to work with, and I am proud to be working with them to this day.
Mike Minelli, Rich Kittler, Mark Barry, David Smith, Joseph Rickert, Dr. Norman Nie, James
Peruvankal, Neera Talbert and Dave Rich at Revolution Analytics let me do one of the best jobs I
could possibly imagine: explaining to people in business why they should be using R. Kirk Mettler,
Richard Schultz, Dr. Bryan Lewis and Jim Winfield at Big Computing encouraged me to have fun,
tackling interesting problems in R. Vincent Saulys, John Weir, and Dr. Saar Golde at Goldman Sachs
made my time there both enjoyable and educational.
Throughout the course of writing this book, many people helped me with the process. First and
foremost is Yin Cheung, who saw all the stress I constantly felt and supported me through many
ruined nights and days.
My editor, Debra Williams, knew just how to encourage me and her guiding hand has been
invaluable. Paul Dix, the series editor and a good friend, was the person who suggested I write this
book, so none of this would have happened without him. Thanks to Caroline Senay and Andrea Fox
for being great copy editors. Without them, this book would not be nearly as well put together. Robert
Mauriello’s technical review was incredibly useful in honing the book’s presentation.
The folks at RStudio, particularly JJ Allaire and Josh Paulson, make an amazing product, which
made the writing process far easier than it would have been otherwise. Yihui Xie, the author of the
knitr package, provided numerous feature changes that I needed to write this book. His software,
and his speed at implementing my requests, is greatly appreciated.
Numerous people have provided valuable feedback as I produced this book, including Chris
Bethel, Dr. Dirk Eddelbuettel, Dr. Ramnath Vaidyanathan, Dr. Eran Bellin, Avi Fisher, Brian Ezra,
Paul Puglia, Nicholas Galasinao, Aaron Schumaker, Adam Hogan, Jeffrey Arnold, and John Houston.
Last fall was my first time teaching, and I am thankful to the students from the Fall 2012
Introduction to Data Science class at Columbia University for being the guinea pigs for the material
that ultimately ended up in this book.
Thank you to everyone who helped along the way.
About the Author
Jared P. Lander is the founder and CEO of Lander Analytics, a statistical consulting firm based in
New York City, the organizer of the New York Open Statistical Programming Meetup, and an adjunct
professor of statistics at Columbia University. He is also a tour guide for Scott’s Pizza Tours and an
advisor to Brewla Bars, a gourmet ice pop start-up. With an M.A. from Columbia University in
statistics and a B.A. from Muhlenberg College in mathematics, he has experience in both academic
research and industry. His work for both large and small organizations spans politics, tech start-ups,
fund-raising, music, finance, healthcare and humanitarian relief efforts.
He specializes in data management, multilevel models, machine learning, generalized linear
models, visualization and statistical computing.
Chapter 1. Getting R
R is a wonderful tool for statistical analysis, visualization and reporting. Its usefulness is best seen in
the wide variety of fields where it is used. We alone have used R for projects with banks, political
campaigns, tech startups, food startups, international development and aid organizations, hospitals
and real estate developers. Other areas where we have seen it used are online advertising, insurance,
ecology, genetics and pharmaceuticals. R is used by statisticians with advanced machine learning
training and by programmers familiar with other languages, and also by people who are not
necessarily trained in advanced data analysis but are tired of using Excel.
Before it can be used it needs to be downloaded and installed, a process that is no more
complicated than installing any other program.
1.1. Downloading R
The first step in using R is getting it on the computer. Unlike with languages such as C++, R must be
installed in order to run.[1] The program is easily obtainable from the Comprehensive R Archive
Network (CRAN), the maintainer of R, at http://cran.r-project.org/. At the top of the
page are links to download R for Windows, Mac OS X and Linux.
1. Technically C++ cannot be set up on its own without a compiler, so something would still need to be installed anyway.
There are prebuilt installations available for Windows and Mac OS X while those for Linux
usually compile from source. Installing R on any of these platforms is just like installing any other
program.
Windows users should click the link Download R for Windows, then base and then Download R
3.x.x for Windows; the x’s indicate the version of R. This changes periodically as improvements are
made.
Similarly, Mac users should click Download R for (Mac) OS X and then R-3.x.x.pkg; again, the x’s
indicate the current version of R. This will also install both 32- and 64-bit versions.
Linux users should download R using their standard distribution mechanism whether that is apt-get
(Ubuntu and Debian), zypper (SUSE) or another source. This will also build and install R.
1.2. R Version
As of this writing, R is at version 3.0.2, which is a big jump from the previous version, 2.15.3.
CRAN follows a one-year release cycle where each major version change increases the middle of the
three numbers in the version. For instance, version 3.0.0 was released in 2013. In 2014 the version
will be incremented to 3.1.0 with 3.2.0 coming in 2015. The last number in the version is for minor
updates to the current major version.
Most R functionality is usually backward compatible with previous versions.
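The numbering scheme can be inspected from within R itself. The sketch below uses the base functions package_version and getRversion; the variable names and the labels first, middle and last are illustrative assumptions. It shows that versions are compared number by number rather than as text, which is why 3.0.2 ranks above 2.15.3.

```r
# Version strings taken from the text above
current  <- package_version("3.0.2")
previous <- package_version("2.15.3")

# Components are compared numerically from left to right,
# so 3.0.2 is newer than 2.15.3 even though 15 is greater than 0
is_newer <- current > previous

# Split a version into its three integer components
# (the labels are illustrative, not official terminology)
parts <- unclass(current)[[1]]
names(parts) <- c("first", "middle", "last")

# The version of the R session running this code
running <- getRversion()

print(is_newer)
print(parts)
print(running)
```

The value of running depends on the installation, so no particular output is assumed for it.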
1.3. 32-bit versus 64-bit
The choice between using 32-bit and using 64-bit comes down to whether the computer supports 64-bit
(most new machines do) and the size of the data to be worked with. A 32-bit build can address at most
about 4 GB of memory (RAM), while a 64-bit build can address far more, so the 64-bit version might as
well be used.
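To check which build a given session is running, a minimal sketch using the built-in .Machine and R.version objects follows; the variable names are illustrative assumptions. Pointer size is used here as a simple indicator: 8 bytes on a 64-bit build and 4 bytes on a 32-bit build.

```r
# Size of a memory address in bytes: 8 on 64-bit builds, 4 on 32-bit builds
pointer_bytes <- .Machine$sizeof.pointer
build_bits <- pointer_bytes * 8

# Architecture string reported by R; the value depends on the platform
arch <- R.version$arch

# Largest value an ordinary R integer can hold, which is the same
# on 32-bit and 64-bit builds
max_int <- .Machine$integer.max

print(build_bits)
print(arch)
print(max_int)
```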
This is especially important starting with version 3.0.0, as that adds support for 64-bit integers,