
Alice Zheng & Amanda Casari
Feature Engineering for Machine Learning
Principles and Techniques for Data Scientists


Beijing · Boston · Farnham · Sebastopol · Tokyo

978-1-491-95324-2
[LSI]
Feature Engineering for Machine Learning
by Alice Zheng and Amanda Casari
Copyright © 2018 Alice Zheng, Amanda Casari. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (http://oreilly.com/safari). For more information, contact our corporate/institutional
sales department: 800-998-9938 or corporate@oreilly.com.
Editors: Rachel Roumeliotis and Jeff Bleiel
Production Editor: Kristen Brown
Copyeditor: Rachel Head
Proofreader: Sonia Saruba
Indexer: Ellen Troutman
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest
April 2018: First Edition
Revision History for the First Edition
2018-03-23: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781491953242 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Feature Engineering for Machine
Learning, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
While the publisher and the authors have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.

Table of Contents
Preface

1. The Machine Learning Pipeline
    Data
    Tasks
    Models
    Features
    Model Evaluation

2. Fancy Tricks with Simple Numbers
    Scalars, Vectors, and Spaces
    Dealing with Counts
    Binarization
    Quantization or Binning
    Log Transformation
    Log Transform in Action
    Power Transforms: Generalization of the Log Transform
    Feature Scaling or Normalization
    Min-Max Scaling
    Standardization (Variance Scaling)
    ℓ² Normalization
    Interaction Features
    Feature Selection
    Summary
    Bibliography

3. Text Data: Flattening, Filtering, and Chunking
    Bag-of-X: Turning Natural Text into Flat Vectors