Data Science Driving Business: Insight into the Value of Big Data

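"Data Science for Business" is a book that examines in depth how data science drives modern business, and it comes recommended by a range of industry experts. It argues that in the era of big data, data has become central to business, and that thinking about a business must go hand in hand with data-analytic thinking. The book aims to help readers understand the scientific principles behind data, whether they are data scientists, business managers, or other professionals who work with data.

Global Vice President Craig Vaughan considers the book required reading for anyone who wants to embrace the opportunities of big data, while Chief Data Officer Ron Bekkerman stresses the pivotal role of data in modern business and holds that reading the book conveys the scientific approach to data-driven decision-making. Ronny Kohavi, Partner Architect in Microsoft's Online Services Division, recommends it to business managers who want a deeper understanding of data science principles and algorithms without going into technical detail; he notes that it offers an integrated, cross-disciplinary perspective. Geoff Webb, editor-in-chief of the journal Data Mining and Knowledge Discovery, praises Provost and Fawcett for turning their deep expertise in practical data analysis into an unrivaled introductory guide. Chief Scientist Claudia Perlich also rates the book highly: she wishes everyone she works with would read it, because it would improve her team's understanding of data-driven advertising and research.

The book is regarded as a cornerstone of data science fundamentals and a valuable reference for anyone who wants to use data effectively in business. It covers key data science concepts such as data collection, cleaning, modeling, prediction, and interpretation. It also explores how data-driven decision-making can improve business efficiency and innovation, and how to apply this knowledge to real business problems. In addition, the book may touch on data ethics, privacy protection, and recent developments in big data technology, helping readers understand the role and impact of data science in a business setting. Readers of "Data Science for Business" learn the basic methods of data analysis and how to apply them in real business scenarios, improving a company's competitiveness and its ability to adapt to a fast-changing market. The book is a valuable resource for anyone interested in applying data science to business, whether they work at a large company or a startup, and whether their background is technical or managerial.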

want to design/implement top-notch data science solutions to business problems, we all need to have a common understanding of this material.

Colleagues also tell us that the book has been quite useful in an unforeseen way: for preparing to interview data science job candidates. The demand from business for hiring data scientists is strong and increasing. In response, more and more job seekers are presenting themselves as data scientists. Every data science job candidate should understand the fundamentals presented in this book. (Our industry colleagues tell us that they are surprised how many do not. We have half-seriously discussed a follow-up pamphlet “Cliff’s Notes to Interviewing for Data Science Jobs.”)

Our Conceptual Approach to Data Science

In this book we introduce a collection of the most important fundamental concepts of data science. Some of these concepts are “headliners” for chapters, and others are introduced more naturally through the discussions (and thus they are not necessarily labeled as fundamental concepts). The concepts span the process from envisioning the problem, to applying data science techniques, to deploying the results to improve decision-making. The concepts also undergird a large array of business analytics methods and techniques.

The concepts fit into three general types:

1. Concepts about how data science fits in the organization and the competitive landscape, including ways to attract, structure, and nurture data science teams; ways for thinking about how data science leads to competitive advantage; and tactical concepts for doing well with data science projects.

2. General ways of thinking data-analytically. These help in identifying appropriate data and considering appropriate methods. The concepts include the data mining process as well as the collection of different high-level data mining tasks.

3. General concepts for actually extracting knowledge from data, which undergird the vast array of data science tasks and their algorithms.

For example, one fundamental concept is that of determining the similarity of two entities described by data. This ability forms the basis for various specific tasks. It may be used directly to find customers similar to a given customer. It forms the core of several prediction algorithms that estimate a target value such as the expected resource usage of a client or the probability that a customer will respond to an offer. It is also the basis for clustering techniques, which group entities by their shared features without a focused objective. Similarity forms the basis of information retrieval, in which documents or webpages relevant to a search query are retrieved. Finally, it underlies several common algorithms for recommendation. A traditional algorithm-oriented book might present each of these tasks in a different chapter, under different names, with common aspects buried in algorithm details or mathematical propositions. In this book we instead focus on the unifying concepts, presenting specific tasks and algorithms as natural manifestations of them.
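To make the similarity idea concrete, here is a minimal Python sketch of finding the customers most similar to a given customer. It is an illustration only, not code from the book: the Euclidean distance measure, the function names `euclidean_distance` and `most_similar`, and the customer data are all assumptions chosen for the example.

```python
import math


def euclidean_distance(a, b):
    """Distance between two numeric feature vectors (smaller means more similar)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(target, others, k=2):
    """Return the names of the k entities in `others` closest to `target`."""
    ranked = sorted(others, key=lambda name: euclidean_distance(target, others[name]))
    return ranked[:k]


# Hypothetical customers: (age in decades, monthly spend in hundreds of dollars).
customers = {
    "alice": (3.0, 5.0),
    "bob": (3.2, 4.8),
    "carol": (6.5, 1.0),
    "dave": (2.9, 5.5),
}

target = customers["alice"]
others = {name: vec for name, vec in customers.items() if name != "alice"}
neighbors = most_similar(target, others, k=2)
print(neighbors)  # ['bob', 'dave']
```

The same distance function could be reused for the other tasks the paragraph lists, for example by averaging a target value over the nearest neighbors to make a prediction. In practice the features would usually be rescaled first so that no single one dominates the distance.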

As another example, in evaluating the utility of a pattern, we see a notion of lift (how much more prevalent a pattern is than would be expected by chance) recurring broadly across data science. It is used to evaluate very different sorts of patterns in different contexts. Algorithms for targeting advertisements are evaluated by computing the lift one gets for the targeted population. Lift is used to judge the weight of evidence for or against a conclusion. Lift helps determine whether a co-occurrence (an association) in data is interesting, as opposed to simply being a natural consequence of popularity.
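As a rough illustration of lift, the sketch below uses one common formalization: the observed rate of a pattern divided by the rate expected by chance. The function names and all counts are hypothetical, chosen only to show the arithmetic, and the passage above does not commit to this exact formula.

```python
def cooccurrence_lift(n_both, n_a, n_b, n_total):
    """Lift of A and B occurring together: p(A, B) / (p(A) * p(B)).

    Written with counts; algebraically equal to the ratio of probabilities.
    """
    if min(n_a, n_b, n_total) <= 0:
        raise ValueError("n_a, n_b and n_total must be positive")
    return (n_both * n_total) / (n_a * n_b)


def targeting_lift(responders_targeted, n_targeted, responders_all, n_all):
    """Response rate in the targeted group divided by the overall response rate."""
    if min(n_targeted, responders_all, n_all) <= 0:
        raise ValueError("n_targeted, responders_all and n_all must be positive")
    return (responders_targeted * n_all) / (n_targeted * responders_all)


# Hypothetical baskets: 1000 in total, A in 200, B in 100, both in 40.
# If A and B were independent we would expect 20 joint baskets, so lift is 2.
print(cooccurrence_lift(40, 200, 100, 1000))  # 2.0

# Hypothetical campaign: 30 of 200 targeted people respond (15%),
# against 500 of 10000 people overall (5%).
print(targeting_lift(30, 200, 500, 10000))  # 3.0
```

A lift near 1 suggests the pattern is about as common as chance alone would produce; values well above 1 suggest it may be worth a closer look.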

We believe that explaining data science around such fundamental concepts not only aids the reader, it also facilitates communication between business stakeholders and data scientists. It provides a shared vocabulary and enables both parties to understand each other better. The shared concepts lead to deeper discussions that may uncover critical issues otherwise missed.

To the Instructor

This book has been used successfully as a textbook for a very wide variety of data science courses. Historically, the book arose from the development of Foster’s multidisciplinary Data Science classes at the Stern School at NYU, starting in the fall of 2005. The original class was nominally for MBA students and MSIS students, but drew students from schools across the university. The most interesting aspect of the class was not that it appealed to MBA and MSIS students, for whom it was designed. More interesting, it also was found to be very valuable by students with strong backgrounds in machine learning and other technical disciplines. Part of the reason seemed to be that the focus on fundamental principles and other issues besides algorithms was missing from their curricula.

At NYU we now use the book in support of a variety of data science–related programs: the original MBA and MSIS programs, undergraduate business analytics, NYU/Stern’s new MS in Business Analytics program, and as the Introduction to Data Science for NYU’s new MS in Data Science. In addition, prior to publication, the book has been adopted by more than a dozen other universities for programs in seven countries (and counting), in business schools, in computer science programs, and for more general introductions to data science.

Stay tuned to the book’s website (see below) for information on how to obtain helpful instructional material, including lecture slides, sample homework questions and problems, example project instructions based on the frameworks from the book, exam questions, and more to come.

We keep an up-to-date list of known adopters on the book’s website. Click Who’s Using It at the top.

Other Skills and Concepts

There are many other concepts and skills that a practical data scientist needs to know besides the fundamental principles of data science. These skills and concepts will be discussed in Chapter 1 and Chapter 2. The interested reader is encouraged to visit the book’s website for pointers to material for learning these additional skills and concepts (for example, scripting in Python, Unix command-line processing, data files, common data formats, databases and querying, big data architectures and systems like MapReduce and Hadoop, data visualization, and other related topics).

Sections and Notation

In addition to occasional footnotes, the book contains boxed “sidebars.” These are essentially extended footnotes. We reserve these for material that we consider interesting and worthwhile, but too long for a footnote and too much of a digression for the main text.

A note on the starred, “curvy road” sections

The occasional mathematical details are relegated to optional “starred” sections. These section titles will have asterisk prefixes, and they will include the “curvy road” graphic you see to the left to indicate that the section contains more detailed mathematics or technical details than elsewhere. The book is written so that these sections may be skipped without loss of continuity, although in a few places we remind readers that details appear there.

Constructions in the text like (Smith and Jones, 2003) indicate a reference to an entry in the bibliography (in this case, the 2003 article or book by Smith and Jones); “Smith and Jones (2003)” is a similar reference. A single bibliography for the entire book appears in the endmatter.

In this book we try to keep math to a minimum, and what math there is we have simplified as much as possible without introducing confusion. For our readers with technical backgrounds, a few comments may be in order regarding our simplifying choices.
