Introduction to Data Mining

Answers to the Exercises in Introduction to Data Mining

  • Updated: 2021-05-14
  • Size: 17.6 MB
  • Category: Data Mining
  • Authors: Pang-Ning Tan, Michael Steinbach, Vipin Kumar
  • Publisher: China Machine Press (机械工业出版社)
  • Format: PDF


This book provides a comprehensive introduction to the theory and methods of data mining, with an emphasis on how data mining can be applied to solve a wide range of practical problems. It draws on many disciplines and is applicable to a broad range of fields.

The book covers five topics: data, classification, association analysis, clustering, and anomaly detection. Except for anomaly detection, each topic spans two chapters: the first presents basic concepts, representative algorithms, and evaluation techniques, while the second explores advanced concepts and algorithms in greater depth. The goal is to give readers a thorough grounding in the fundamentals of data mining while also introducing important advanced topics.

Features

·Includes numerous figures and tables, comprehensive examples, and a rich set of exercises.

·Requires no database background and only minimal knowledge of statistics or mathematics.

·Is accompanied by extensive online supplementary resources, including lecture slides (PPT), exercise solutions, and data sets.

Table of Contents

  • preface v
  • 1 introduction 1
  • 1.1 what is data mining 2
  • 1.2 motivating challenges 4
  • 1.3 the origins of data mining 6
  • 1.4 data mining tasks 7
  • 1.5 scope and organization of the book 11
  • 1.6 bibliographic notes 13
  • 1.7 exercises 16
  • 2 data 19
  • 2.1 types of data 22
  • 2.1.1 attributes and measurement 23
  • 2.1.2 types of data sets 29
  • 2.2 data quality 36
  • 2.2.1 measurement and data collection issues 37
  • 2.2.2 issues related to applications 43
  • 2.3 data preprocessing 44
  • 2.3.1 aggregation 45
  • 2.3.2 sampling 47
  • 2.3.3 dimensionality reduction 50
  • 2.3.4 feature subset selection 52
  • 2.3.5 feature creation 55
  • 2.3.6 discretization and binarization 57
  • 2.3.7 variable transformation 63
  • 2.4 measures of similarity and dissimilarity 65
  • 2.4.1 basics 66
  • 2.4.2 similarity and dissimilarity between simple attributes 67
  • 2.4.3 dissimilarities between data objects 69
  • 2.4.4 similarities between data objects 72
  • 2.4.5 examples of proximity measures 73
  • 2.4.6 issues in proximity calculation 80
  • 2.4.7 selecting the right proximity measure 83
  • 2.5 bibliographic notes 84
  • 2.6 exercises 88
  • 3 exploring data 97
  • 3.1 the iris data set 98
  • 3.2 summary statistics 98
  • 3.2.1 frequencies and the mode 99
  • 3.2.2 percentiles 100
  • 3.2.3 measures of location: mean and median 101
  • 3.2.4 measures of spread: range and variance 102
  • 3.2.5 multivariate summary statistics 104
  • 3.2.6 other ways to summarize the data 105
  • 3.3 visualization 105
  • 3.3.1 motivations for visualization 105
  • 3.3.2 general concepts 106
  • 3.3.3 techniques 110
  • 3.3.4 visualizing higher-dimensional data 124
  • 3.3.5 do’s and don’ts 130
  • 3.4 olap and multidimensional data analysis 131
  • 3.4.1 representing iris data as a multidimensional array 131
  • 3.4.2 multidimensional data: the general case 133
  • 3.4.3 analyzing multidimensional data 135
  • 3.4.4 final comments on multidimensional data analysis 139
  • 3.5 bibliographic notes 139
  • 3.6 exercises 141
  • 4 classification: basic concepts, decision trees, and model evaluation 145
  • 4.1 preliminaries 146
  • 4.2 general approach to solving a classification problem 148
  • 4.3 decision tree induction 150
  • 4.3.1 how a decision tree works 150
  • 4.3.2 how to build a decision tree 151
  • 4.3.3 methods for expressing attribute test conditions 155
  • 4.3.4 measures for selecting the best split 158
  • 4.3.5 algorithm for decision tree induction 164
  • 4.3.6 an example: web robot detection 166
  • 4.3.7 characteristics of decision tree induction 168
  • 4.4 model overfitting 172
  • 4.4.1 overfitting due to presence of noise 175
  • 4.4.2 overfitting due to lack of representative samples 177
  • 4.4.3 overfitting and the multiple comparison procedure 178
  • 4.4.4 estimation of generalization errors 179
  • 4.4.5 handling overfitting in decision tree induction 184
  • 4.5 evaluating the performance of a classifier 186
  • 4.5.1 holdout method 186
  • 4.5.2 random subsampling 187
  • 4.5.3 cross-validation 187
  • 4.5.4 bootstrap 188
  • 4.6 methods for comparing classifiers 188
  • 4.6.1 estimating a confidence interval for accuracy 189
  • 4.6.2 comparing the performance of two models 191
  • 4.6.3 comparing the performance of two classifiers 192
  • 4.7 bibliographic notes 193
  • 4.8 exercises 198
  • 5 classification: alternative techniques 207
  • 5.1 rule-based classifier 207
  • 5.1.1 how a rule-based classifier works 209
  • 5.1.2 rule-ordering schemes 211
  • 5.1.3 how to build a rule-based classifier 212
  • 5.1.4 direct methods for rule extraction 213
  • 5.1.5 indirect methods for rule extraction 221
  • 5.1.6 characteristics of rule-based classifiers 223
  • 5.2 nearest-neighbor classifiers 223
  • 5.2.1 algorithm 225
  • 5.2.2 characteristics of nearest-neighbor classifiers 226
  • 5.3 bayesian classifiers 227
  • 5.3.1 bayes theorem 228
  • 5.3.2 using the bayes theorem for classification 229
  • 5.3.3 naïve bayes classifier 231
  • 5.3.4 bayes error rate 238
  • 5.3.5 bayesian belief networks 240
  • 5.4 artificial neural network (ann) 246
  • 5.4.1 perceptron 247
  • 5.4.2 multilayer artificial neural network 251
  • 5.4.3 characteristics of ann 255
  • 5.5 support vector machine (svm) 256
  • 5.5.1 maximum margin hyperplanes 256
  • 5.5.2 linear svm: separable case 259
  • 5.5.3 linear svm: nonseparable case 266
  • 5.5.4 nonlinear svm 270
  • 5.5.5 characteristics of svm 276
  • 5.6 ensemble methods 276
  • 5.6.1 rationale for ensemble method 277
  • 5.6.2 methods for constructing an ensemble classifier 278
  • 5.6.3 bias-variance decomposition 281
  • 5.6.4 bagging 283
  • 5.6.5 boosting 285
  • 5.6.6 random forests 290
  • 5.6.7 empirical comparison among ensemble methods 294
  • 5.7 class imbalance problem 294
  • 5.7.1 alternative metrics 295
  • 5.7.2 the receiver operating characteristic curve 298
  • 5.7.3 cost-sensitive learning 302
  • 5.7.4 sampling-based approaches 305
  • 5.8 multiclass problem 306
  • 5.9 bibliographic notes 309
  • 5.10 exercises 315
  • 6 association analysis: basic concepts and algorithms 327
  • 6.1 problem definition 328
  • 6.2 frequent itemset generation 332
  • 6.2.1 the apriori principle 333
  • 6.2.2 frequent itemset generation in the apriori algorithm 335
  • 6.2.3 candidate generation and pruning 338
  • 6.2.4 support counting 342
  • 6.2.5 computational complexity 345
  • 6.3 rule generation 349
  • 6.3.1 confidence-based pruning 350
  • 6.3.2 rule generation in apriori algorithm 350
  • 6.3.3 an example: congressional voting records 352
  • 6.4 compact representation of frequent itemsets 353
  • 6.4.1 maximal frequent itemsets 354
  • 6.4.2 closed frequent itemsets 355
  • 6.5 alternative methods for generating frequent itemsets 359
  • 6.6 fp-growth algorithm 363
  • 6.6.1 fp-tree representation 363
  • 6.6.2 frequent itemset generation in fp-growth algorithm366
  • 6.7 evaluation of association patterns 370
  • 6.7.1 objective measures of interestingness 371
  • 6.7.2 measures beyond pairs of binary variables 382
  • 6.7.3 simpson’s paradox 384
  • 6.8 effect of skewed support distribution 386
  • 6.9 bibliographic notes 390
  • 6.10 exercises 404
  • 7 association analysis: advanced concepts 415
  • 7.1 handling categorical attributes 415
  • 7.2 handling continuous attributes 418
  • 7.2.1 discretization-based methods 418
  • 7.2.2 statistics-based methods 422
  • 7.2.3 non-discretization methods 424
  • 7.3 handling a concept hierarchy 426
  • 7.4 sequential patterns 429
  • 7.4.1 problem formulation 429
  • 7.4.2 sequential pattern discovery 431
  • 7.4.3 timing constraints 436
  • 7.4.4 alternative counting schemes 439
  • 7.5 subgraph patterns 442
  • 7.5.1 graphs and subgraphs 443
  • 7.5.2 frequent subgraph mining 444
  • 7.5.3 apriori-like method 447
  • 7.5.4 candidate generation 448
  • 7.5.5 candidate pruning 453
  • 7.5.6 support counting 457
  • 7.6 infrequent patterns 457
  • 7.6.1 negative patterns 458
  • 7.6.2 negatively correlated patterns 458
  • 7.6.3 comparisons among infrequent patterns, negative patterns, and negatively correlated patterns 460
  • 7.6.4 techniques for mining interesting infrequent patterns 461
  • 7.6.5 techniques based on mining negative patterns 463
  • 7.6.6 techniques based on support expectation 465
  • 7.7 bibliographic notes 469
  • 7.8 exercises 473
  • 8 cluster analysis: basic concepts and algorithms 487
  • 8.1 overview 490
  • 8.1.1 what is cluster analysis 490
  • 8.1.2 different types of clusterings 491
  • 8.1.3 different types of clusters 493
  • 8.2 k-means 496
  • 8.2.1 the basic k-means algorithm 497
  • 8.2.2 k-means: additional issues 506
  • 8.2.3 bisecting k-means 508
  • 8.2.4 k-means and different types of clusters 510
  • 8.2.5 strengths and weaknesses 510
  • 8.2.6 k-means as an optimization problem 513
  • 8.3 agglomerative hierarchical clustering 515
  • 8.3.1 basic agglomerative hierarchical clustering algorithm 516
  • 8.3.2 specific techniques 518
  • 8.3.3 the lance-williams formula for cluster proximity 524
  • 8.3.4 key issues in hierarchical clustering 524
  • 8.3.5 strengths and weaknesses 526
  • 8.4 dbscan 526
  • 8.4.1 traditional density: center-based approach 527
  • 8.4.2 the dbscan algorithm 528
  • 8.4.3 strengths and weaknesses 530
  • 8.5 cluster evaluation 532
  • 8.5.1 overview 533
  • 8.5.2 unsupervised cluster evaluation using cohesion and separation 536
  • 8.5.3 unsupervised cluster evaluation using the proximity matrix 542
  • 8.5.4 unsupervised evaluation of hierarchical clustering 544
  • 8.5.5 determining the correct number of clusters 546
  • 8.5.6 clustering tendency 547
  • 8.5.7 supervised measures of cluster validity 548
  • 8.5.8 assessing the significance of cluster validity measures 553
  • 8.6 bibliographic notes 555
  • 8.7 exercises 559
  • 9 cluster analysis: additional issues and algorithms 569
  • 9.1 characteristics of data, clusters, and clustering algorithms 570
  • 9.1.1 example: comparing k-means and dbscan 570
  • 9.1.2 data characteristics 571
  • 9.1.3 cluster characteristics 573
  • 9.1.4 general characteristics of clustering algorithms 575
  • 9.2 prototype-based clustering 577
  • 9.2.1 fuzzy clustering 577
  • 9.2.2 clustering using mixture models 583
  • 9.2.3 self-organizing maps (som) 594
  • 9.3 density-based clustering 600
  • 9.3.1 grid-based clustering 601
  • 9.3.2 subspace clustering 604
  • 9.3.3 denclue: a kernel-based scheme for density-based clustering 608
  • 9.4 graph-based clustering 612
  • 9.4.1 sparsification 613
  • 9.4.2 minimum spanning tree (mst) clustering 614
  • 9.4.3 opossum: optimal partitioning of sparse similarities using metis 616
  • 9.4.4 chameleon: hierarchical clustering with dynamic modeling 616
  • 9.4.5 shared nearest neighbor similarity 622
  • 9.4.6 the jarvis-patrick clustering algorithm 625
  • 9.4.7 snn density 627
  • 9.4.8 snn density-based clustering 629
  • 9.5 scalable clustering algorithms 630
  • 9.5.1 scalability: general issues and approaches 630
  • 9.5.2 birch 633
  • 9.5.3 cure 635
  • 9.6 which clustering algorithm 639
  • 9.7 bibliographic notes 643
  • 9.8 exercises 647
  • 10 anomaly detection 651
  • 10.1 preliminaries 653
  • 10.1.1 causes of anomalies 653
  • 10.1.2 approaches to anomaly detection 654
  • 10.1.3 the use of class labels 655
  • 10.1.4 issues 656
  • 10.2 statistical approaches 658
  • 10.2.1 detecting outliers in a univariate normal distribution 659
  • 10.2.2 outliers in a multivariate normal distribution 661
  • 10.2.3 a mixture model approach for anomaly detection 662
  • 10.2.4 strengths and weaknesses 665
  • 10.3 proximity-based outlier detection 666
  • 10.3.1 strengths and weaknesses 666
  • 10.4 density-based outlier detection 668
  • 10.4.1 detection of outliers using relative density 669
  • 10.4.2 strengths and weaknesses 670
  • 10.5 clustering-based techniques 671
  • 10.5.1 assessing the extent to which an object belongs to a cluster 672
  • 10.5.2 impact of outliers on the initial clustering 674
  • 10.5.3 the number of clusters to use 674
  • 10.5.4 strengths and weaknesses 674
  • 10.6 bibliographic notes 675
  • 10.7 exercises 680
  • appendix a linear algebra 685
  • a.1 vectors 685
  • a.1.1 definition 685
  • a.1.2 vector addition and multiplication by a scalar 685
  • a.1.3 vector spaces 687
  • a.1.4 the dot product, orthogonality, and orthogonal projections 688
  • a.1.5 vectors and data analysis 690
  • a.2 matrices 691
  • a.2.1 matrices: definitions 691
  • a.2.2 matrices: addition and multiplication by a scalar 692
  • a.2.3 matrices: multiplication 693
  • a.2.4 linear transformations and inverse matrices 695
  • a.2.5 eigenvalue and singular value decomposition 697
  • a.2.6 matrices and data analysis 699
  • a.3 bibliographic notes 700
  • appendix b dimensionality reduction 701
  • b.1 pca and svd 701
  • b.1.1 principal components analysis (pca) 701
  • b.1.2 svd 706
  • b.2 other dimensionality reduction techniques 708
  • b.2.1 factor analysis 708
  • b.2.2 locally linear embedding (lle) 710
  • b.2.3 multidimensional scaling, fastmap, and isomap 712
  • b.2.4 common issues 715
  • b.3 bibliographic notes 716
  • appendix c probability and statistics 719
  • c.1 probability 719
  • c.1.1 expected values 722
  • c.2 statistics 723
  • c.2.1 point estimation 724
  • c.2.2 central limit theorem 724
  • c.2.3 interval estimation 725
  • c.3 hypothesis testing 726
  • appendix d regression 729
  • d.1 preliminaries 729
  • d.2 simple linear regression 730
  • d.2.1 least square method 731
  • d.2.2 analyzing regression errors 733
  • d.2.3 analyzing goodness of fit 735
  • d.3 multivariate linear regression 736
  • d.4 alternative least-square regression methods 737
  • appendix e optimization 739
  • e.1 unconstrained optimization 739
  • e.1.1 numerical methods 742
  • e.2 constrained optimization 746
  • e.2.1 equality constraints 746
  • e.2.2 inequality constraints 747
  • author index 750
  • subject index 758
  • copyright permissions 769

Download

Download link 1: https://pan.baidu.com/s/1BxfEbIb3FW-J48_DO5qOUA
